Running a C8 on platform version 2.3.9.200
The Zigbee radio dropped offline yesterday, and I have been seeing multiple offline/online messages in the Hub event logs. I'm also seeing many Java errors in the logs that are connected to various Zigbee devices.
I powered off for ten minutes yesterday, but that did not resolve the issue.
Both Device and App stats show a 0.8% load
Hub Info shows a 6.5% cpu load and internal temp at 36.0 °C
The java errors seem to be the issue I need to resolve but I don't know where to begin.
After coffee and a closer look at the timestamps, the Java errors are due to the Zigbee radio being offline, so they're not the cause.
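For anyone wanting to repeat that timestamp check, here is a minimal sketch of the idea: see how many error timestamps fall inside the windows when Zigbee was offline. The timestamps and the function name are made up for illustration; you would copy the real values by hand from the Hub events page and the Logs tab.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def parse(ts):
    """Convert a 'YYYY-MM-DD HH:MM:SS' string to a datetime."""
    return datetime.strptime(ts, FMT)

def errors_inside_offline_windows(windows, errors):
    """Return the error timestamps that fall inside any (offline, online) window."""
    spans = [(parse(start), parse(end)) for start, end in windows]
    return [e for e in errors
            if any(start <= parse(e) <= end for start, end in spans)]

# Hypothetical sample data, not taken from a real hub.
windows = [("2023-12-10 03:00:05", "2023-12-10 03:00:40")]
errors = [
    "2023-12-10 03:00:10",
    "2023-12-10 03:00:31",
    "2023-12-10 09:15:00",
]

inside = errors_inside_offline_windows(windows, errors)
print(f"{len(inside)} of {len(errors)} errors occurred while Zigbee was offline")
```

If nearly all of the errors land inside offline windows, they are most likely a symptom of the radio being down rather than a separate problem.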
Taking this back a bit, yesterday was the first time you noticed this? In other words, the hub was performing just fine, then all of a sudden this started happening?
Have you done anything with the hub recently, added a device, updated hub firmware, added rules or added web integrations?
Anything stand out in Device Stats or App Stats in the Logs tab? Similarly, do any devices show a huge number of messages compared to the others when looking at the Zigbee Settings in the Settings tab?
Yes, there has been a steady set of changes over the last week, which I am certainly suspicious of, but I do keep a close eye on Device Stats and App Stats and have not seen any concerns. I updated the firmware on several overly chatty Zigbee outlets (power monitors) several weeks ago to get the device counts down, and that was successful. Since then it's been mostly tweaks to rules, hub variables, room light configurations, etc. to support the Christmas light season we are in.
I have never noticed any Zigbee offline/online behavior, but I also seldom look at the Hub events page, so I may have missed them if they had occurred. Zigbee went totally offline a couple of nights ago and the lights did not turn off that night. I rebooted the hub to get the Zigbee system back online. I did read up on others having similar issues last year, so I then also tried the power-off cycle.
Today some of my Zigbee sensors went offline and I could not get them to reconnect, so I revisited my entire setup. This included switching my home wifi networks and changing the Zigbee channel to make sure it was on a clear band. I also walked through all of my Zigbee devices and attempted firmware updates (12 of 35 updated). One of my Zigbee outlets required a full reset, and it happens to be in the area of the house where it could have been acting as a repeater for some of the failed devices, so I'm suspicious about it too.
So now I'm in wait-and-see mode as the network settles out. The last Zigbee offline/online was early this morning at 3am. I just finished my updates 30 minutes ago, so I'll report back tomorrow or if another event happens.
That device that was totally locked up is suspect, in my opinion.
Those short online/offline events are not atypical. Some people don't even know they are happening because they are so brief. It isn't good that it is doing that, but it wouldn't be the first thing I would dig into when trying to diagnose this.
The longer offline period without recovery is where you should focus. With all those changes you made, I think you need to let things settle for a couple of days and see what happens. While I don't think it should make the hub crash, trying to do too many things at once (e.g. dozens of firmware updates) could possibly overwhelm the radio.
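To separate the brief blips from the longer outages, one rough approach is to pair each offline event with the next online event and label the gap by its length. The sketch below assumes you have copied the event times out of the Hub events page by hand; the sample events and the 60-second cutoff are arbitrary values chosen for illustration, not anything the hub defines.

```python
from datetime import datetime, timedelta

def offline_gaps(events):
    """Pair each 'offline' event with the next 'online' event.

    events: list of (datetime, state) tuples, state being 'offline' or 'online'.
    Returns a list of (offline_start, duration) tuples.
    """
    gaps = []
    start = None
    for ts, state in sorted(events):
        if state == "offline" and start is None:
            start = ts
        elif state == "online" and start is not None:
            gaps.append((start, ts - start))
            start = None
    return gaps

def classify(gaps, threshold=timedelta(seconds=60)):
    """Label each gap 'brief' or 'long' against an assumed threshold."""
    return [(start, dur, "long" if dur > threshold else "brief")
            for start, dur in gaps]

# Hypothetical events, not taken from a real hub.
events = [
    (datetime(2023, 12, 10, 3, 0, 5), "offline"),
    (datetime(2023, 12, 10, 3, 0, 20), "online"),
    (datetime(2023, 12, 11, 22, 30, 0), "offline"),
    (datetime(2023, 12, 12, 7, 45, 0), "online"),
]

for start, dur, label in classify(offline_gaps(events)):
    print(f"{start}  offline for {dur}  ({label})")
```

The "long" entries are the ones worth lining up against recent changes such as firmware updates or a device that needed a reset.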
You might try doing a controlled shutdown via the Settings menu, pull the power cord, and wait 20 or so minutes before rebooting. Interrupting power for a few seconds will clear the radio, and the longer wait will put the Zigbee devices into panic mode and cause them to seek new routes once you power up the hub. I can't say whether this will help or not, or even advise you to do it now vs waiting a day, but it can't hurt to try.
Ok, good ideas.
BTW, I did the Zigbee device updates one at a time, doubling up on a few after noticing no impact on the update time.
I'm going to wait at least a day before doing anything else.
All good for the last 24 hours. No offline/online occurrences and all devices working.
My guess is that it was that device that had locked up. Maybe it was spewing messages? I didn't see anything in the Device Stats, but I didn't look before rebooting after Zigbee went offline.
Anyway, thanks for the support! It's great that we have folks like you in an active community!