Reboot vs. Shutdown to Repair Zigbee Network

Over the course of a week or so, various Zigbee devices we’re using stop responding. My solution is to shut down the hub, kill the power for 30 minutes, and restart the hub. Once I do this, all is well until it’s not.

Does anyone know if a simple reboot would accomplish the same thing? It would be easier to schedule a weekly, middle-of-the-night reboot than to shut down the hub and wait 30 minutes. Or if there is an easier solution, I’d love to hear it. I'm running a C7 hub on the latest firmware.

TIA for any guidance anyone can offer.

30 mins is excessive, in my opinion. The goal is to let the power drain internally so that the radios shut down. 10 seconds is probably enough... but feel free to go for longer. I'm only suggesting a tiny delay in case that reduces the inconvenience you're experiencing.

Power cycling the Z-Radios will cause them to reboot, which does NOT occur with a simple Shutdown.

I would come at it from a different angle.
Has this only just started happening?
Have you recently added a zigbee device?
Is there anything in the logs?
Are the devices that start to go offline first routing through a specific repeater?
Is that repeater the first device that goes awol?
etc.

Just my opinion of how I would try and troubleshoot the issue.

I agree, the issue is not whether to reboot or shut down; the issue is why devices stop responding to start with. That is worth spending some time investigating to get to the root cause of the issue.

In incident management, the restart is just a corrective action to get up and running again. Now is the time for the recovery phase: investigating the root cause and finding a permanent fix for it.

Thanks for the suggestions to look into the root cause. I don't see anything in the logs and the most frequent offenders (highlighted in yellow) are attached directly to the hub.

Any suggestions as to how to better troubleshoot things are appreciated.

30 minutes actually is about right. However, the purpose is not to drain the power in the radios, but to put devices into a panic mode in which they try to find the hub. This is an extreme measure, and a better, healthier mesh is likely the best solution in this case.

Huh, that is the first time I have had it explained that way. I've always heard it is more to drain the radios to get them to restart, and that 30-60 seconds is plenty to do that. I guess not.

"A Zigbee device typically enters "panic mode" to find a new parent router after its current one (like the hub) is gone for around 20 to 30 minutes, triggering a network heal as it starts searching for the next best connection, though some battery devices might wait longer (up to an hour) for their scheduled check-in before actively panicking."

No, it will not solve the problem. I would make sure your repeaters (mains-powered devices) are not frequently unplugged and do not lose power.

I would search the community; there are many threads out there from people troubleshooting these types of issues. Basically, you want to look at the Zigbee logs (found in the Zigbee settings), not the Hub logs. Look for unresponsive devices making no logs, and for very busy devices that are constantly churning out logs and may be spamming the network. They are live logs only, so you have to watch them. You can send commands to devices to see in the logs how they react.

You might also want to install the Zigbee Map app to get a better visual and more info about the mesh.
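If you jot down which device each live Zigbee log line came from while you watch, a few lines of Python can do the sorting for you. This is only a rough sketch: the device names, the sample observations, and the `busy_threshold` value are all made up for illustration, and it does not read anything from the hub itself.

```python
from collections import Counter

def classify_devices(known_devices, observed_entries, busy_threshold):
    """Split devices into silent, normal, and busy groups.

    known_devices: names of every Zigbee device paired to the hub.
    observed_entries: one device name per log line seen while watching.
    busy_threshold: entries at or above this count mark a device as busy.
    """
    counts = Counter(observed_entries)  # missing devices count as 0
    silent = sorted(d for d in known_devices if counts[d] == 0)
    busy = sorted(d for d in known_devices if counts[d] >= busy_threshold)
    normal = sorted(d for d in known_devices if 0 < counts[d] < busy_threshold)
    return silent, normal, busy

# Hypothetical data: names noted by hand during one watching session.
known = ["Hall Bulb", "Porch Plug", "Dryer Sensor", "Desk Lamp"]
seen = ["Dryer Sensor"] * 12 + ["Hall Bulb"] * 2 + ["Porch Plug"]

silent, normal, busy = classify_devices(known, seen, busy_threshold=10)
print("silent:", silent)  # candidates for "unresponsive, making no logs"
print("busy:", busy)      # candidates for "spamming the network"
```

A silent device is not necessarily broken (a battery sensor may simply not have checked in yet), so treat both lists as places to look first, not as a verdict.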

Do you or your household members regularly cut the power to the bulbs from the wall switches?

When this started, did a battery die in a battery-powered device, or did anything happen to a battery-powered device that made it have to rejoin? I don't mean a device removed from the hub; the best example would be a dead battery that you replaced, after which the device reconnected, or any event that would cause something similar.

I ask this because I had a battery-powered Third Reality vibration sensor whose battery died. When I changed the batteries and it reconnected, something crazy happened. While the device was detecting vibration, it caused numerous bulbs and plugs to become unresponsive. They wouldn't work manually, in automations, or by any other method. I had to completely delete the device from HE and add it back. Now everything works normally.

There was nothing in the logs when this happened, there was no indicator in the hub events, nothing. I just happened to stumble across it with the dryer running. I have no idea how a battery powered device could do that but it did.

Are there any errors in the Logs tab, Past Logs, that pertain to any Zigbee device?

What about Location Events or Hub Events in the Logs section that might pertain to some Zigbee device or even the Zigbee radio itself?

If you go to the Settings page, Zigbee, and sort the Messages column on the right from high to low, is there any device that is way excessive in messages compared to anything else?
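If the list is long, one rough way to decide what counts as "way excessive" is to compare each device's message count with the median across all devices. The sketch below assumes you have copied the counts by hand into a dictionary; the device names, the numbers, and the factor of 10 are hypothetical choices for illustration, not values taken from the hub.

```python
from statistics import median

def excessive_talkers(message_counts, factor=10):
    """Return (device, count) pairs whose count exceeds factor * median,
    sorted from the highest count down."""
    if not message_counts:
        return []
    typical = median(message_counts.values())
    cutoff = factor * max(typical, 1)  # avoid a zero cutoff on a quiet mesh
    flagged = [(name, n) for name, n in message_counts.items() if n > cutoff]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)

# Hypothetical counts copied from the Messages column.
counts = {
    "Hall Bulb": 140,
    "Porch Plug": 95,
    "Desk Lamp": 120,
    "Dryer Sensor": 9800,
}

for name, n in excessive_talkers(counts):
    print(f"{name}: {n} messages")
```

The median is used instead of the average so that one very chatty device does not drag the baseline up and hide itself.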