Hub: C-5
Integrations: Alexa, Maker API (homebridge), Maker API (NodeRed)
For the last month my hub has been locking up (requiring a power cycle) almost weekly. Prior to that, it was about once per month.
What are the best logs to look at? I have the Hub Information driver installed, but I have not been watching memory/load and rebooting based on it (or running a scheduled reboot).
I'd like to try to narrow down the cause before just rebooting.
I'd start with a soft reset to clean up your database if you've had that many hard reboots. Then I would check the logs in general. Look for chatty devices, too much power reporting, and so on. The next time it happens, tag @bobbyd or @gopher.ny and PM them your MAC or UID so they can take a look at your engineering logs.
I'm having this same issue. Started a week or two ago.
I already have my hub configured to reboot nightly, but even then I still have to power cycle it during the day. It's in that frozen state right now. Very frustrating.
The first symptom was a DB corruption error, so I did the soft reset and reverted to a backup from about 3 days earlier. That seemed fine for a day or two, but now I'm back to daily freezes.
(Also C-5 hub. Integrations with Alexa, Ring, Maker API (HomeBridge), IFTTT, SmartThings, and probably a couple others I can't recall.)
I would suggest you try removing the cloud integrations one at a time and slowly reintegrate them to see what happens.
I have personally killed my dev hub with Node-RED when working on creating a UPS flow. TCP seems to be rough on the CPU. I am not sure why, but when I look at the "App Stats" tab in the Logs section of the UI, anything that uses LAN connectivity to connect to cloud or local devices generally uses more resources or time.
Also, running a soft reset if you are experiencing hard resets does help. Sometimes it takes a few to clean up everything.
A healthy environment really shouldn't require reboots other than the typical reboots with updates. I almost never force a restart of my hub.
I have had the same issue for the last 2 weeks with my C-4 hub. This morning it didn't run my morning rule and I had to do a soft reset. It could be accessed over HTTP on port 8081, but clicking reboot did not work. I had to power it off and let it cool down for 1-2 minutes. (No Z-Wave, only 2 Zigbee devices.)
It looks clean. I was just making sure nothing was amiss in there hampering the mesh, which can cause lockups.
Start with what @mavrrick58 advised, but another thing I would look at is what's beating up your Maker API. I've seen a lot of lockups due to that getting hit hard and choking the system.
Ghost/stranded nodes aren't shown. Two years ago, my C-5 had a "clean" Z-Wave details page, but the engineering logs picked up half a dozen ghost/stranded nodes that I used a secondary controller to remove.
This was exactly my point with my example. It isn't hard to create a perfectly working flow with Node-RED that can blow up the hub by flooding it with unnecessary updates.
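To make the "unnecessary updates" point concrete, here is a minimal sketch in Python of the kind of filter that can sit in front of the hub: it drops values that have not changed and limits how often each device gets updated. The class name, the `should_send` method, and the 5-second default are all made up for illustration. This is not Node-RED or Maker API code, just the idea.

```python
import time
from typing import Callable, Dict, Hashable, Optional, Tuple


class UpdateThrottle:
    """Hypothetical helper: drop repeated values and rate-limit per device."""

    def __init__(self, min_interval_s: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.min_interval_s = min_interval_s  # assumed value, tune to taste
        self._clock = clock
        # device id -> (last value sent, time it was sent)
        self._last: Dict[Hashable, Tuple[object, float]] = {}

    def should_send(self, device_id: Hashable, value: object) -> bool:
        """Return True only if this update is worth forwarding to the hub."""
        now = self._clock()
        previous: Optional[Tuple[object, float]] = self._last.get(device_id)
        if previous is not None:
            last_value, last_time = previous
            if value == last_value:
                return False  # nothing changed, no need to hit the hub
            if now - last_time < self.min_interval_s:
                return False  # changed, but too soon after the last send
        self._last[device_id] = (value, now)
        return True
```

Call `should_send(device_id, value)` before every request and skip the request when it returns False. One caveat with this simple version: a changed value that arrives too soon is dropped rather than queued, so a real flow would probably want to send the latest value once the interval has passed.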
Whenever I've seen ghosts (regular ones) on my C-5, there were no in/out nodes. I'm not saying you're not right (likely you are), just what my experience has been. To be fair, I haven't fired up my C-5 in a while.
I would start with anything that uses TCP to connect. From what I see, that would be Alexa and Maker API.
Also, if you only run one instance of Maker API, you may want to install two instances of it so you can see whether Homebridge or Node-RED is creating the load.
I would also let it run for a while and then check the stats pages under the Log section.
You may also want to look at the Hub Info device to see what it is reporting for CPU usage. Do you log any of that stuff externally? If so, look for a trend of CPU spikes.
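If you do log CPU externally, a rough way to spot spikes is to compare each reading against the average of the few readings before it. Below is a minimal Python sketch of that idea. The function name, the window of 5 samples, and the 2x factor are my own assumptions, and it expects you to have already exported your readings as (timestamp, CPU percent) pairs from wherever you log them.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def find_cpu_spikes(samples: Sequence[Tuple[str, float]],
                    window: int = 5,
                    factor: float = 2.0) -> List[Tuple[str, float]]:
    """Return samples whose CPU value exceeds factor x the trailing average.

    samples: (timestamp, cpu_percent) pairs in chronological order.
    window and factor are illustrative defaults, not recommended values.
    """
    spikes: List[Tuple[str, float]] = []
    for i in range(window, len(samples)):
        # Average of the `window` readings just before this one
        baseline = mean(value for _, value in samples[i - window:i])
        stamp, value = samples[i]
        if baseline > 0 and value > factor * baseline:
            spikes.append((stamp, value))
    return spikes
```

Lining up the timestamps it returns with the hub's logs and App Stats may help show which app or integration was busy just before a freeze.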
I had a bunch of them at the beginning, a few years back when I was just starting out, and Bobby went over things. I ended up wiping the hub, it was so bad. Then I learned about the hidden stranded nodes and still found some. That was about the time I switched to the C-7.