Hub crashing when on vacation

My hub has crashed (non-responsive, power cycle brings it back just fine) both of the last two times I left for vacation. I don't have anything triggered to run while I'm gone, I don't use presence, and I can't find anything helpful in my logs. I discovered the crashes both times when trying to operate devices remotely but my hub was not responding.

What's the right way to debug this?

Here's everything I have on my hub:

  • Ecobee Integration
  • Google Home (this is inactive)
  • Google Home Community
  • Hubigraphs
  • Hubitat Package Manager
  • Hubitat® Dashboard
  • Hubitat® Safety Monitor
  • Motion and Mode Lighting Apps
  • MyQ Lite
  • Rule Machine
  • Rule-4.1
  • Simple Automation Rule 1.1

Hubitat Elevation® Platform Version 2.2.7.128
Hardware Version Rev C-7

Go to the Runtime Stats tab. Check the Device stats and App stats, and see if anything in particular is eating lots of memory, or shows in red.

Same for Logs tab, anything showing errors?

You could also schedule a reboot every couple of days (or every night?) and see if that at least keeps things up and running. This might help if the hub crashes on a regular interval, say every 2 days. If it locks up at random times, this probably won't help. And it isn't a permanent fix by any means; that will take more diagnosis.
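If you note down when each crash happened, a rough way to tell "regular interval" from "random" is to look at how much the gaps between crashes vary. This is only a sketch: the function names and the 25% spread cutoff are my own assumptions, not anything Hubitat provides, and the timestamps have to come from your own notes.

```python
from datetime import datetime
from statistics import mean, pstdev


def crash_intervals_hours(crash_times):
    """Return the gaps, in hours, between consecutive crash timestamps."""
    ordered = sorted(crash_times)
    return [(b - a).total_seconds() / 3600 for a, b in zip(ordered, ordered[1:])]


def looks_periodic(crash_times, max_relative_spread=0.25):
    """Guess whether crashes recur on a roughly fixed interval.

    max_relative_spread is an arbitrary cutoff (assumption): the standard
    deviation of the gaps divided by their mean must stay at or below it.
    """
    gaps = crash_intervals_hours(crash_times)
    if len(gaps) < 2:
        return False  # not enough crashes to say anything
    average = mean(gaps)
    if average == 0:
        return False
    return pstdev(gaps) / average <= max_relative_spread
```

If `looks_periodic(...)` comes back true, a scheduled reboot slightly more often than the average gap is worth trying; if not, the reboot schedule is mostly guesswork.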

I have seen this crash a hub before, especially if polling is set wrong. I don't know what the right way to set it is, but there are a couple threads from probably a month or two ago that went over this integration chewing up memory.

Remove it if you aren't using it, or any other apps. Keep the variables down for now.

I don't know if I have seen any complaints about this, but I don't follow this app too closely. It is another non-built-in (community) app like Ecobee, and there is always a chance that it could be doing something to cause the crash. At the very least, keep a close watch on the app stats and logs, and see if you spot anything suspicious.

If you can't make any progress, contact support and see if they can see any errors popping up on the diagnostic side of your hub.

What does red mean? I would assume "bad" but I'd like a bit more detail. My core Ecobee seems to be far and away the most intensive item; however, there are a few others showing red for "Events" and I wonder how concerning that is?

Yes.

I am not sure exactly what triggers red.

Then I would concentrate on that first. See if these help:

  • Severe CPU Load? - #60 by storageanarchy
  • [DEPRECATED] Universal Ecobee Suite, Version 1.8.01 - #538 by storageanarchy
  • [DEPRECATED] Universal Ecobee Suite, Version 1.8.01 - #543 by storageanarchy

I think anything red is concerning; the hub has flagged it for heavy load. Whether any one thing is bad enough on its own to kill the hub is hard to tell. But if you have multiple things in red, it surely points to issues that you need to fix.

I would look at the last link above, and see what your events are set to. I personally think the lower limit you should set them to (11) is a good place to start unless you have a reason to store dozens if not hundreds of past events. I had junk going back a year, for no reason whatsoever. I wasn't going to go back and see what the temperature of the hall closet was a year ago... The hub seems much snappier after getting rid of the old stuff.

Awesome info, thx. I haven't read those threads yet but I organically found SANdood's Ecobee stuff last night, so I disabled the OEM integration and put his in place. I think I'm observing lower numbers now, but my main thermostat is still the most intensive object in my platform. I'll dig into this more tonight.

The next most intensive items are 2 RGBW bulbs. I forget their brand offhand (Sylvania?), but they're integrated as Generic ZigBee RGBW Light.

A couple of notes:

  1. Red does NOT mean "bad" - it just means that the usage is higher than some internal max limit. Large applications such as my Ecobee Suite and Echo Speaks use a LOT of stateful data so as to minimize CPU overhead. Fact is, using a lot of memory is preferable to using a lot of CPU, since the latter can cause missed or delayed events from any/all of your devices. I have carefully tuned my Ecobee Suite to be extremely Hubitat friendly.

  2. Given that the Ecobee Suite collects, monitors, maintains, and updates over 300 attributes of EACH Ecobee thermostat you use, plus a few dozen for each sensor, you should not be surprised that it is resource intensive. The benefit of this is an EXTREMELY resilient integration of Ecobee thermostats and sensors, much more so than the stock interface provided by Ecobee (which isn't "bad" either, just not as robust as my Ecobee Suite).

  3. Things are "bad" if the TOTAL load on your hub is too large, not any single individual device. Should you find that your TOTAL load is exceeding 10-15% of the CPU time on your hub, and/or you find less than 100,000KB of available memory, THEN you should start looking to reduce your load. It may be that you need to consider adding a second Hubitat hub to run your "intense" applications, and then using Hub Mesh to share the necessary devices back to your primary hub.

  4. DEFINITELY you should set your Ecobee Suite polling cycle to 1 minute (to minimize CPU overhead - trust me, in this case More is Less).

  5. DEFINITELY you should set your Event and State histories to 11 (to minimize memory utilization). This actually could be applied to every single device in your environment.

  6. YMMV

  1. understood, good info
  2. I like it!
  3. thank you for the benchmarks. I'm not exactly sure where to find those totals (all I see is the runtime stats page with no summary), but I'll figure it out.
  4. Done!
  5. Cool... how? I'm happy to do this for all my devices but don't see a way.
  6. yes, yes it will. lol. I'm in for the ride!

Right, red just means some value is above a certain (and somewhat arbitrary) threshold. Some of the more complex apps get highlighted in it by the nature of their complexity.

Yup, that.

Total load can be higher: it starts to affect the radios at well over 100%, which is also when the "elevated load" alert pops up. As for memory, 100M remaining means the hub is about to start slowing down.

Use these two endpoints: http://your.hubs.ip.here/hub/advanced/deviceStateHistorySize/11 and http://your.hubs.ip.here/hub/advanced/event/limit/11. There will be a proper UI for them in a future release.
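If you'd rather script it than paste the URLs into a browser, here is a minimal sketch that builds both URLs for a given hub IP and limit and then requests them. The two paths are the ones given above; the helper names are hypothetical, and it assumes a plain GET with no hub login works the same as opening the link in a browser.

```python
from urllib.request import urlopen


def history_limit_urls(hub_ip, limit=11):
    """Build the two history-limit URLs quoted in this thread."""
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    base = f"http://{hub_ip}/hub/advanced"
    return [
        f"{base}/deviceStateHistorySize/{limit}",  # state history per device
        f"{base}/event/limit/{limit}",             # event history per device
    ]


def apply_history_limits(hub_ip, limit=11, timeout=10):
    """Request each URL and return {url: HTTP status}.

    Assumption: an unauthenticated GET is enough, as it is in a browser
    on a hub without login security enabled.
    """
    results = {}
    for url in history_limit_urls(hub_ip, limit):
        with urlopen(url, timeout=timeout) as response:
            results[url] = response.status
    return results
```

Calling `apply_history_limits("your.hubs.ip.here")` from a machine on the same network should hit both endpoints; `history_limit_urls` on its own just prints what would be requested, which is handy for checking the addresses first.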

The stats page shows the CPU time vs. Total time in the banner immediately above the data table.
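To turn those banner numbers into something you can compare against the rough guidance earlier in the thread (10-15% of CPU time, 100,000KB of free memory), a small sketch like this works. The function names are made up, you have to copy the values in by hand, and the defaults are just the numbers quoted in this thread, so treat them as adjustable rather than official limits.

```python
def cpu_share_percent(cpu_time, total_time):
    """Percent of total time spent on CPU; both values must use the same unit."""
    if total_time <= 0:
        raise ValueError("total_time must be positive")
    return 100.0 * cpu_time / total_time


def load_warnings(cpu_time, total_time, free_memory_kb,
                  cpu_limit_percent=15.0, min_free_kb=100_000):
    """Return human-readable warnings for the thresholds quoted in the thread.

    The defaults are the thread's rules of thumb (assumption: they suit your
    hub); pass different limits if you prefer a looser CPU ceiling.
    """
    warnings = []
    share = cpu_share_percent(cpu_time, total_time)
    if share > cpu_limit_percent:
        warnings.append(f"CPU share {share:.1f}% is above {cpu_limit_percent}%")
    if free_memory_kb < min_free_kb:
        warnings.append(f"Free memory {free_memory_kb} KB is below {min_free_kb} KB")
    return warnings
```

An empty list means neither rule of thumb is tripped; anything else tells you which one to chase first.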

There is a URL command now, and an option to change it globally on the Runtime Stats page is coming soon in the next Hubitat firmware release (2.2.8.***).

So it's been 6 days since I last made changes (switching to Ecobee Suite and lowering poll to 1min) and, unfortunately, my hub just crashed again. I did not yet change my Event and State histories but, given the crash, I will do that now and see how long I can go.

Maybe a reboot twice a week, then. There's a community hub rebooter app that makes scheduling reboots super easy. Lowering event/state history to 11 (if running 2.2.7 or below) or 5 (if running 2.2.8 or above) should help, although it's hard to say just to what extent.

That's an interesting nugget... I set my event/state history to 11 while on 2.2.7, and I've been wondering if that value was still "good" now that I'm on 2.2.8 (given that there were some database improvements).

Would you recommend proactively changing from 11 -> 5, or just let 11 ride and see how it plays out?

Thanks!

I'd say don't worry about it unless there are hundreds of apps/drivers on the hub. Also, if there's an app/driver that stores things like images in state, those values should be set to minimum to keep images from clogging up the works. For the vast majority of setups, the difference between 5 and 11 will not be noticeable.

Good stuff - thank you!

So it's been a while and I'm coming back here mostly because my hub still crashes regularly. It's ridiculous, to the point where I'm ready to go back to SmartThings. I'm an extremely lightweight user with only a few apps installed, so the fact that HE crashes this much and offers no way to understand the crashes is extremely frustrating. I wish I could love the system more!

About 3 months ago I got tired of walking to my AV area to power cycle the hub so, somewhat ironically, I got a standalone Kasa plug so I could cycle it remotely. Hub still crashes at least weekly.

About 2 months ago I completely uninstalled all Ecobee integrations. Hub still crashes at least weekly.

Seriously, what else can I do?