C8 Hub Lockup weekly

I'm about a month in as a new user of Hubitat, having come over from Wink. I'm loving the Hubitat platform, but disappointingly I'm experiencing a lockup about once per week. I notice it when one of my daily automations doesn't run.

Here are some details on my hub:
Platform version: 2.3.6.146
Hardware version: Rev C-8
Diagnostics tool version: 1.1.113

It's connected via hardwired Ethernet with a static IP address. It's powered by the supplied power brick. My home network is a mix of hardwired and wireless. All the hardwired devices, including the hub, go to a central switch, as do 3 WiFi bridges in the house.
There are about 16 Z-Wave devices, including 1 repeater, and 4 Zigbee devices connected to the hub. There are a few virtual switches and virtual dimmers, as well as some group switch and group dimmer devices. There are also several WiFi-connected devices linked to various apps, including iAquaLink pool integration, Logitech Harmony Hub integration, PixelBlaze Controller integration, Echo Speaks devices, Honeywell Thermostat, and OpenWeatherMap. I have really enjoyed the ease with which so many things integrate with Hubitat.

Now regarding my problem ...
The second-to-last time the hub locked up, I noticed that the webpage could not be accessed and the hub could not be pinged. I didn't know to try the diagnostics page, so I didn't try that. My 11pm automation didn't run, and when I looked at the logs I could see that all logging had stopped at 5:46pm. The last log entry was an innocuous Philips Hue motion sensor lux report. No errors were reported.
After rebooting the hub (hard power cycle), I backed up and restored the database. That was 6 days ago.

Today my Sunrise automations did not run. I tried to access the hub webpage and it was unresponsive. But I could ping the hub, and I could access the diagnostics page. This time I rebooted it from the diagnostics page and it booted up successfully and everything worked.

This time the logs do NOT show any gap. Logs are continuous from the 11pm automation that ran all the way to the reboot, but conspicuously missing any indication of the sunrise automation that should have run but did not. It was a windy night, so there was a lot of outdoor motion sensor activity reported. The last Honeywell TotalConnect update occurred at 1:34am. Typically that runs every 10 min. There was nothing unusual about that last Honeywell TCC update in the logs. So something happened after 1:34am where the Honeywell TCC stopped, my rules stopped running, the webpage became unresponsive, etc. Only the 2 Zigbee outdoor motion sensors and 2 Z-Wave garage door controllers were reporting status in the log during that time.
So this time it did not lock up completely, but something crashed pretty hard, leaving other things still running.
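
For anyone wanting to pin down when a periodic job like the 10-minute Honeywell TCC update stopped, here is a rough Python sketch that flags unusually long gaps between log timestamps. The function name `find_gaps`, its `slack` parameter, the placeholder date, and the 7:00am reboot time are all made up for illustration; only the 10-minute interval and the 1:34am last update come from the description above.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, expected=timedelta(minutes=10), slack=2.0):
    """Return (start, end) pairs where consecutive events are more than
    expected * slack apart."""
    ordered = sorted(timestamps)
    limit = expected * slack
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]

day = datetime(2000, 1, 1)  # placeholder date, not from the real logs
# Hypothetical update times leading up to the last one seen at 1:34am.
updates = [day.replace(hour=1, minute=m) for m in (4, 14, 24, 34)]
reboot = day.replace(hour=7, minute=0)  # hypothetical reboot time

gaps = find_gaps(updates + [reboot])
for start, end in gaps:
    print(f"no updates between {start:%H:%M} and {end:%H:%M}")
```

Running the same check against other periodic entries (rules, other integrations) would show whether they all stopped at roughly the same moment.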

Can you give me any advice on what I should do to further diagnose whether this is an app-related failure or a hardware-related failure? I really need my automation to be stable over the long haul; having to reboot once per week is unacceptable.

I should add that since the previous lockup 6 days ago to the latest one today I did not add any new apps or devices or rules. The hub was just humming along until it wasn't.

Thanks for the detailed post. It really helps us better understand what the issue may be. Have you performed a Soft Reset to see if that helps? See: Soft Reset | Hubitat Documentation

If that doesn't help, please send me a private message and/or create a warranty case by visiting support.hubitat.com so we can take a look at your hub's engineering log for additional clues (assuming your hub is connected to the cloud).

@greenbiker Do you have jumbo frames enabled in the network anywhere?

Do you have any Zooz 4-in-1 sensors or ZEN25 outlets?

Can you post your Z-Wave details page in its entirety? (use Windows snip)

Can you look at the hub memory endpoint and see whether your free memory at any time during the lockup (or near it) got below 100,000? Go to http://hubs.ip.address.here/hub/advanced/freeOSMemoryHistory
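
If you'd rather script the check than eyeball it, something like the Python sketch below could work. It assumes the endpoint returns comma-separated lines with a timestamp in the first field and the free-memory figure in the second; that format is an assumption, so adjust the parsing to whatever your hub actually returns. The function names are made up for this example.

```python
import urllib.request

def parse_free_memory(text):
    """Extract (timestamp, free_memory) pairs from comma-separated text.
    Lines whose second field is not an integer (e.g. a header) are skipped."""
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) < 2:
            continue
        try:
            rows.append((parts[0], int(parts[1])))
        except ValueError:
            continue  # header or malformed line
    return rows

def low_memory_rows(rows, threshold=100_000):
    """Return the rows whose free memory is below the threshold."""
    return [row for row in rows if row[1] < threshold]

def fetch_history(hub_ip):
    """Download the raw history text from the hub (needs network access)."""
    url = f"http://{hub_ip}/hub/advanced/freeOSMemoryHistory"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Usage would be along the lines of `low_memory_rows(parse_free_memory(fetch_history("hubs.ip.address.here")))`; an empty list means nothing dipped below 100,000 in the recorded history.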

Thanks for the tips. I'll try a soft reset.
In response to @rlithgow1: no, I do not have jumbo frames enabled on my network.
I do have a single Zooz 700 Series Z-Wave Plus Smart Plug ZEN04 in my system. That is the only Zooz device. Curious why you asked about specific Zooz devices?
I can post my Z-Wave details at a later time; no access at the moment.
Thanks for the tip on checking the memory usage. I'll check that out as well.

Both devices I mentioned can crash the hub.

Thanks!
Z-Wave details attached. Memory usage history never goes below about 417,000 (the latest entry in the log), BUT there is only about 3 days of history since my last reboot. I do notice that it is decreasing over time. Here are 3 samples 24 hours apart:
12/9 8:35am: 524,140 (right after reboot)
12/10 8:35am: 440,128
12/11 8:35am: 426,472
That seems to indicate a slow memory leak.
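
To put numbers on that trend, here is a small Python sketch that works out the day-over-day drop from the three samples above. The function name `daily_drops` is just for illustration.

```python
# Free-memory samples taken 24 hours apart, as listed above.
samples = [
    ("12/9 8:35am", 524_140),
    ("12/10 8:35am", 440_128),
    ("12/11 8:35am", 426_472),
]

def daily_drops(samples):
    """Return (label, drop) pairs, where drop is how much free memory
    fell since the previous sample."""
    return [
        (later[0], earlier[1] - later[1])
        for earlier, later in zip(samples, samples[1:])
    ]

drops = daily_drops(samples)
for label, drop in drops:
    print(f"{label}: down {drop:,}")
# 12/10 8:35am: down 84,012
# 12/11 8:35am: down 13,656
```

The second drop is much smaller than the first, so a few more days of samples would help show whether the decline keeps going or levels off.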