Automations stop working two hours or so after restart

Hi Guys,
I'm new to Hubitat since the C-7 became available this spring. In the last week or so, I've been having a rather large issue where none of my automations/controls work after the hub has been running for a couple of hours.
For example:

The "Hank" Zigbee button I use to turn off the bedroom light and fans (BOND hub, Zigbee plug and Kasa plug) gets no response. (These devices work fine when controlled directly by Alexa.)

The "Zooz" motion sensors stop turning on the kitchen LEDs (Kasa plug) and the Zooz hall lights.

The problem seems to be with the hub itself, or with Rule Machine / Motion Lighting / Simple Automations. Buttons become completely unresponsive: no logging, not even a battery level.

Things I've tried:

  1. Restart the hub - when I do this, everything works for about two hours before going unresponsive again.
  2. Restore the hub - since I only had the automatic restore points the hub makes on its own, I was only able to go back 5 days. The problem still persisted.
  3. I tried to think of things I've changed in the last week or so, but I've only been messing around with dashboards on a Fire tablet and the "Smartly JSInject" for CSS. I also integrated a Lutron Pro hub, but I haven't done anything with it yet.
  4. I've let the logs run, but honestly I'm not sure what I'm looking for that would cause the whole "brain" of the hub to go unresponsive.

Thank you so much for any suggestions. At this point I was just considering wiping the hub and starting over.

Somebody will come along shortly and ask to see logs, but I’ll ask simpler questions to start:

  • You mention restoring a backup. Did you do a soft reset first? The soft reset means that the database gets completely rebuilt when you restore. This often seems to clear up these sorts of problems if they’re caused by DB corruption. I don’t think you get the same benefit from just restoring the backup.

https://docs.hubitat.com/index.php?title=Soft_Reset

  • You mention rebooting. Did you shut down and pull power for 20-30 seconds? Just rebooting never shuts down the radios, so it won’t help if you have a radio problem.

Thanks for the reply!
I did both a reboot and, most recently (when I was looking on these forums), a power off with the plug out for 1 min before powering back on. I'm not home currently, but I'll try the soft reset and restore later today.

Each of the symptoms you describe leads me to think it must be something like a resource getting used up, e.g. memory or CPU. After trying the soft reset and restore, I'd start by disabling custom apps/drivers one by one to see if you can identify a culprit. You could then move on to turning off rules a couple at a time.
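If it helps to keep track, here is a rough Python sketch of that elimination order: custom apps/drivers one at a time, then rules in pairs. The names are placeholders, not anything from your hub, and it only prints a checklist. It doesn't talk to the hub at all.

```python
from typing import Iterator, List


def elimination_rounds(
    custom_items: List[str], rules: List[str], rule_batch: int = 2
) -> Iterator[List[str]]:
    """Yield what to disable in each test round.

    Custom apps/drivers go one per round; rules go `rule_batch` per round.
    """
    for item in custom_items:
        yield [item]
    for i in range(0, len(rules), rule_batch):
        yield rules[i:i + rule_batch]


# Placeholder names, purely for illustration.
rounds = list(
    elimination_rounds(
        ["Custom app A", "Custom driver B"],
        ["Rule 1", "Rule 2", "Rule 3"],
    )
)

for n, disabled in enumerate(rounds, start=1):
    print(f"Round {n}: disable {', '.join(disabled)}, then watch the hub for a few hours")
```

Re-enable each item before moving on to the next round, so only one change is in play at a time.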

If you still have no luck you may want to look at a few of the "watchdog" style apps to try and work out where the problem may be.

Check Settings > Z-Wave Details and look for any devices with nothing in the Clusters column, apart from the first entry, which is the hub radio. I often see the symptoms you describe when there is a failed pairing. If there are ghosts, removing them should clear up your issue.
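To spell out the rule I'm applying, here is a small Python sketch. It assumes you have copied the table by hand into a list. The `ZWaveRow` type, the labels and the cluster strings are all made up for illustration, and it does not read anything from the hub.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ZWaveRow:
    node_id: str
    label: str
    clusters: str  # text from the Clusters column; empty for a ghost


def find_ghosts(rows: List[ZWaveRow]) -> List[ZWaveRow]:
    """Return rows with an empty Clusters column, skipping the first row.

    The first row is the hub radio, which normally shows no clusters.
    """
    return [row for row in rows[1:] if not row.clusters.strip()]


# Hand-copied example table; every value here is a placeholder.
rows = [
    ZWaveRow("01", "Hub radio", ""),
    ZWaveRow("02", "Hall light", "0x5E, 0x26"),
    ZWaveRow("03", "", ""),
    ZWaveRow("04", "Motion sensor", "0x5E, 0x30"),
]

ghosts = find_ghosts(rows)
for ghost in ghosts:
    print(f"Node {ghost.node_id} looks like a ghost (no clusters listed)")
```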

To remove the ghost, first shut down your hub using the shutdown option in Settings, then power it back on after a minute. Go into Settings > Z-Wave Details and click Repair on the ghost node. It may go to "not responding" or "failed". If it goes to "not responding", click Repair on the device again, then click Remove once the repair completes and the device goes to "failed". If it still won't let you remove it, the hub can still reach the device, which was probably joined a second time. You'll need to remove power from the failed device, run the device repair again, and then remove it.
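The same steps as a tiny decision function, in case the order is easier to follow that way. This is only my reading of the procedure written as Python. `NodeStatus` and `next_step` are names I made up, and nothing here calls the hub.

```python
from enum import Enum


class NodeStatus(Enum):
    NOT_RESPONDING = "not responding"
    FAILED = "failed"


def next_step(status: NodeStatus, remove_refused: bool = False) -> str:
    """Suggest the next manual action for a ghost node.

    `remove_refused` means Remove was already tried and the hub would not
    drop the node, which suggests the hub can still reach the device.
    """
    if remove_refused:
        return "Remove power from the device, run Repair again, then Remove"
    if status is NodeStatus.FAILED:
        return "Click Remove"
    # Still "not responding": repair again until it shows as failed.
    return "Click Repair again"


print(next_step(NodeStatus.NOT_RESPONDING))
print(next_step(NodeStatus.FAILED))
print(next_step(NodeStatus.FAILED, remove_refused=True))
```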

When this problem started, I didn't have any "ghost" Z-Wave devices, but as I've been troubleshooting I now have three, as you described. I'm also seeing a "zwave busy" error in the logs.

The "zwave busy" error is being caused by the ghost nodes. Shut down and pull power for a minute to clear the radio, and then remove the ghosts.

I did the shutdown and tried removing the devices. I was able to get one; the rest are stubborn little guys. What do you mean by "clear the radio"? Sorry, still new at this. Thanks for the help!

The "zwave busy" messages show up when the radio gets overwhelmed. Shutting down and pulling power is like rebooting Windows when it's acting funny.

Ah OK, yeah, I've been doing a bunch of shutdowns in the last two days as I try to work through this. I still can't get rid of the last couple of phantom devices, though. Is there a way to force remove? The devices are Monoprice contact door switches, which showed as not responding, so I excluded them and tried re-adding. I couldn't get them to show up when running an inclusion, but apparently some part of them was added.

Make sure they are powered down. As long as the hub can ping them it won't let you remove them.

Just wanna say thanks, man. I removed power from the contact sensors, shut down the hub, and individually excluded each one, even though they weren't showing up. I let the hub sit for a few hours to repair the network and then re-added each sensor. So far so good with the non-responsive problem; we shall see.
