Model C-5 Resource Leak?

Yup, it would. :slight_smile:

All 3 of my HE hubs are physically connected to the same switch, too, so I don't think it is an external networking issue... Could be wrong, but since the hub works fine after a forced reboot I don't think I am.

I'm way over here throwing darts at the "Wild Guesses" dart board... But the fact that 8081 went away too is at a whole 'nother level based on how the Diagnostic Menu is wholly separate.

The combo suggests that the Hub's network stack got 'severed' - either by no memory or a semaphore. And my darts are landing closest to semaphore :smiley:

I've had the slowdown issues and never had hub connect installed.

I've had slowdown issues. I had hub connect installed. The slowdown didn't start until after I uninstalled it because I no longer needed it.

Well the hub made it 7 hours with hubconnect client removed, so I added it back on.

We'll see what happens. My guess is it will run fine now. LOL. Which I guess is a good thing. :smile:

This is one of the risks of using the Hubitat websocket and why the proxy server is being developed.

Check this out... This is Hub 1, which is my 2nd floor hub. 95+% of device activity flows outward to the server. Look at the proxyEvents metrics, which cover about 31 hours since I last restarted the proxy.

1,190,830 events received by the proxy from the server hub websocket, but only 510 events forwarded to Hub 1. That's almost 1.2 million calls to parse() that have been avoided.
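To illustrate why a proxy can cut the parse() load so sharply, here is a minimal sketch in Python. It is not the actual HubConnect proxy code; `EventProxy`, `subscribe`, `handle`, and the event dictionary shape are hypothetical names invented for this example. The idea is simply that the proxy receives every event but only forwards the ones a given hub has asked for.

```python
from collections import defaultdict


class EventProxy:
    """Toy event proxy (hypothetical): forwards an event to a hub only if
    that hub has subscribed to the device that produced it."""

    def __init__(self):
        self.subscriptions = defaultdict(set)  # hub name -> set of device ids
        self.received = 0                      # every event seen by the proxy
        self.forwarded = defaultdict(int)      # hub name -> events passed on

    def subscribe(self, hub, device_id):
        self.subscriptions[hub].add(device_id)

    def handle(self, event):
        """Count the event and return the hubs it was forwarded to."""
        self.received += 1
        targets = [hub for hub, devices in self.subscriptions.items()
                   if event["deviceId"] in devices]
        for hub in targets:
            self.forwarded[hub] += 1
        return targets


# Synthetic traffic: device 42 is the only one hub1 cares about,
# and it produces 1 event in every 100.
proxy = EventProxy()
proxy.subscribe("hub1", 42)
for i in range(1000):
    proxy.handle({"deviceId": 42 if i % 100 == 0 else 7, "value": i})

print(proxy.received, proxy.forwarded["hub1"])
```

Every event that is dropped at the proxy is one parse() call the hub never has to make, which is the effect the metrics above are showing.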

If you keep having Hub slowdowns I would suggest switching over to http for a while. I suspect event saturation on the websocket might be dragging the hub down.

Oh, and Happy New Year!

Good comments, and duly noted! And that may very well be it.

I'm testing the alpha MQTT app, which has made my event count go up (literally) exponentially the way I have it currently set up. My MQTT broker generates ~70 messages/s (3-5K/min, ~200K/hr, ~5M/day). Many of those end up creating events the way I have the MQTT app set up right now.

So it is certainly feasible that it is slowly chipping away at the resources on the remote hub, or falling behind and causing issues. Using HTTP for the HubConnect connection would likely reduce loading dramatically.
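As a quick sanity check on those figures, the conversion from a per-second rate to larger totals can be sketched like this. It assumes a perfectly steady rate, which real traffic will not have; the 50/s lower bound is taken from the 50-70 events/second range quoted later in the thread.

```python
def rates(per_second):
    """Convert a steady per-second message rate to per-minute, per-hour,
    and per-day totals."""
    per_minute = per_second * 60
    per_hour = per_minute * 60
    per_day = per_hour * 24
    return per_minute, per_hour, per_day


for r in (50, 70):
    print(r, rates(r))
```

At 50/s this gives 3,000/min, 180,000/hr, and 4,320,000/day; at 70/s it gives 4,200/min, 252,000/hr, and 6,048,000/day. The quoted ~200K/hr and ~5M/day sit between those two bounds.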

And Happy New Year to all as well! :smile:

If that is 70 calls to parse() in the MQTT device on the hub, that is likely more than the hub can safely handle. In my stress testing of HubConnect when I developed the websocket I started to see the hub slow down before I even hit 20.

I believe it is. And then a subsequent ~70 event handlers/second on the parent app the driver sends the events to after receiving them from the MQTT broker.

Probably not a very nice thing to do to the hub.

The MQTT app can also be set up similar to HTTP on HubConnect, which is what I will ultimately do, I'm sure. I was just testing the 'out of the box' automatic HA to HE integration. But I don't think that is going to work for my system. Just too many messages to parse out/filter/act on.

Well, I'm happy to say that I know exactly why my C5 hub dies/loses communication. It is definitely the combination of massive # of events being created (because of my MQTT testing) and using the HubConnect client via event socket.

I can reproduce "at will" (if "at will" means reboot the hub and wait 3-6 hours). Today it took from 8:30am to 2pm to die.

It is interesting to me that I have the exact same HubConnect connection going to a C4 hub, and it does not die. Just my C5 hub.

What I have architecturally:
RadioHub1 (C4 with USB radio stick) <--> Coordinator Hub (C4 without USB radio stick) <--> RadioHub2 (C5 - the one that dies)

For the record, I don't know for a fact that it "crashes", although based on looking at the logs after reboot I am pretty sure it does. I do know that when it "messes up" I can't communicate with the hub on port 80 or 8081, and have to hard power cycle it.

@mike.maxwell @bravenel If any of that is useful for you in testing, let me know. If not, I'll just undo the HubConnect on that hub / change it from event socket to HTTP and move on with my life.

Thanks, this is helpful. We will definitely look into this further (may take a bit given the holiday).

No hurry / worries. The hub that dies on my end isn't doing anything important right now. It is in a holding pattern waiting for me to add some osram bulbs to it.

So if there is anything you want me to try/data to collect from it, it is available for the tinkering.

My MQTT app never envisaged anywhere near that level of incoming messages. Helpfully, the way HE implements the MQTT driver, it queues messages and waits for parse to complete before calling it again. I do subsequently throw an event to handle the parse, but I can see that on my system they are generally being handled without backlogging, at roughly 80 ms each. I'm guessing 10/second would start to cause problems.

What will be happening, though, is that the MQTT buffer within the driver will grow on HE (I can't query how many messages are pending), and eventually memory problems must materialize, along with horrible delays in updates. Jason has mentioned 45 minutes... I'll see what I can do in my alpha5test topic to ignore more updates.
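A back-of-the-envelope model of that buffer growth, as a rough sketch only: it assumes a constant arrival rate and a single consumer that handles one message at a time, which is a simplification of whatever the hub actually does internally. The function names are made up for the example.

```python
def max_sustainable_rate(parse_ms):
    """Messages per second a single serial consumer can keep up with
    if each message takes parse_ms milliseconds to handle."""
    return 1000.0 / parse_ms


def backlog_after(seconds, arrival_per_s, parse_ms):
    """Messages still queued after `seconds`, under the simplifying
    assumptions of a constant arrival rate and one serial consumer."""
    growth_per_s = max(0.0, arrival_per_s - max_sustainable_rate(parse_ms))
    return growth_per_s * seconds


print(max_sustainable_rate(80))      # ceiling at 80 ms per message
print(backlog_after(3600, 10, 80))   # 10/s for an hour
print(backlog_after(3600, 70, 80))   # 70/s for an hour
```

At 80 ms per message the ceiling under this model is 12.5 messages/s, so 10/s is already close to the limit. A sustained 70/s would leave about 207,000 messages queued after one hour, which is the kind of growth that would show up as memory pressure and long update delays.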

Really Jason 70/second ! ? !

First, I certainly don't mean to insinuate that there is a problem with MQTT or HubConnect. The way I have things set up right now is not how a normal person would implement it in production. This is for testing.

A couple of salient points:

  1. My MQTT broker is getting 50-70 events/second from Home Assistant into the HA StateStream topic. That is all messages (state, timestamp, other attributes, etc).
  2. On the MQTT logs in HE I see an average of 5 messages/s being logged - so the MQTT driver does a pretty good job of filtering out the non-necessary MQTT topics/pubs/subs.
  3. The C4 hub with MQTT installed is not the one that crashes or locks up. The hub with MQTT is running fine, and runs fine 24/7 even with the existing messaging load.
  4. It is the C5 hub that is connected via HubConnect socket connection to the MQTT hub that is dying - which makes it seem like it is specific to the C5 device itself.
  5. I have the exact same HubConnect socket connection going from the MQTT hub to a different production C4 hub, and it does not die or lock up. That again points to the C5 as the variable that differs.

:smile::smile::smile::smile::smile:

ditto

I have about 3 updates /sec into my app - I have never seen a slowdown or the queuing building up. I lightly use HubConnect.

I have never experienced slowdowns on any of my 4 hubs + ST ... but .. my hubs are all C4's and I do not have large Z networks and I don't use RM very much at all either. Just a lot of events being generated..

I was just briefly in the locked up C5 hub club. Missed mode changes, slow rule execution, hub completely stopped for hours before I noticed and had to reboot. Turned on the logging options in the last few rules I'd modified and discovered an accidental rule recursion. Fixed the rule and now I'm out of the club!

Didn’t mean to imply that your app was at fault...
Quite the opposite. I was busting on JasonJoel.

I did quite a bit of benchmarking, and for obvious reasons the size of the message payload is inversely proportional to throughput. For HubConnect I had no issue pushing 10 messages per second through parse() with no noticeable slowdown. But those were small websocket messages of only 300-500 bytes, small enough for a single TCP packet.

When I was trying to develop a connector to listen to my UniFi controller eventsocket the hub was slowing down with just 2-4 messages per second. But the messages were several KB. I suspect the biggest culprit was parsing the JSON.
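The payload-size effect is easy to measure off-hub. The sketch below uses Python's standard `json` module on a desktop machine, so the absolute timings say nothing about the hub itself; it only illustrates how per-message parse cost can be compared between a small payload and one of several KB. The payload contents are synthetic and the helper names are made up for the example.

```python
import json
import time


def make_payload(n_fields):
    """Build a synthetic JSON string with n_fields key/value pairs."""
    return json.dumps({f"key{i}": {"value": i, "name": f"device-{i}"}
                       for i in range(n_fields)})


def parse_seconds(payload, repeats=200):
    """Average wall-clock seconds for one json.loads() call."""
    start = time.perf_counter()
    for _ in range(repeats):
        json.loads(payload)
    return (time.perf_counter() - start) / repeats


small = make_payload(5)     # a couple of hundred bytes
large = make_payload(200)   # several KB

for label, payload in (("small", small), ("large", large)):
    print(label, len(payload), "bytes", parse_seconds(payload), "s per parse")
```

Multiplying the per-parse time by the message rate gives a rough share of each second spent just decoding JSON, before any event handling happens.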

He’s a special kind of special. :grin:
I think that has got to be a record.

I have traced the slowdowns (in my environment) to code that makes extensive use of HubActions, which seem to be very heavy calls (versus http). I set up a test where I was sending a HubAction with a UDP payload every 1/2 second and saw the hub slow down to the point where other actions were lagging 2-3 seconds. Even sending a HubAction every second had a noticeable impact. In contrast, I don’t see the same slowing when sending httpGet calls at the same frequency.

Not saying that’s the root cause, just some feedback from my testing.

phew - I don't make any in my app..

I agree above 10/sec events things start to backlog.. but I think I can handle that - at least as long as it's a temporary situation.. If it's sustained the future is grim...

No argument there. After more testing I did go back and exclude the chattier devices; now I'm at a super boring event rate of <5/s. Lame.

Side note, my C5 hub crashed again - even with HubConnect set to HTTP auth and no devices being passed. Even though I don't think HubConnect had anything to do with the crash, I removed it (again) and if the hub stays alive until tomorrow night I'll add it back on and see what happens.

Very weird.

I can easily lock my C4 up by enabling continuous power metering in the driver for a few Zigbee plugs.

With 10+ power events per second (in addition to all other events from my 70 devices) I just have to wait up to 24h before the hub gets unresponsive.
