I need help diagnosing a problem with my C7. Any ideas would be most welcome.
I've owned this C7 hub for a bit less than three years. Although I have seen isolated episodes of "zigbee radio offline" in the past, the hub now seems to be in a cycle of roughly two weeks of uptime after which I see the radio going offline for 7-8 seconds, then back online, repeatedly. This pattern has now been observed three times in a row, over the past 6-7 weeks.
Interestingly, the offline/online sequence seems to begin right after a backup.
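One way to check that correlation more rigorously would be to compare timestamps pulled from the logs. This is only a rough sketch: the function name, the two timestamp lists, and the 10-minute window are assumptions made for illustration, and the timestamps would have to be copied out of the hub logs by hand or by whatever export is available.

```python
from datetime import datetime, timedelta


def offline_after_backup(offline_times, backup_times, window=timedelta(minutes=10)):
    """Pair each backup with the first radio-offline event that follows it.

    offline_times / backup_times: lists of datetime objects (hypothetical
    inputs, e.g. transcribed from the hub logs).
    window: how soon after a backup an offline event must occur to count
    (10 minutes is an arbitrary assumption).
    Returns a list of (backup_time, first_offline_time) tuples.
    """
    offline_sorted = sorted(offline_times)
    pairs = []
    for backup in sorted(backup_times):
        for event in offline_sorted:
            # Keep only the first offline event inside the window.
            if backup <= event <= backup + window:
                pairs.append((backup, event))
                break
    return pairs
```

If most offline bursts show up in the returned pairs while most backups do not, the backup is more likely a trigger than the root cause. If every burst follows a backup, moving the backup time and seeing whether the bursts move with it would be a cheap test.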
The hub doesn't seem to be overloaded in any particular way. Here's what the last 30 days look like (reboot on 8/20 due to this online/offline sequence). Median CPU over that period was 6.75% and max 40% (sampling every 7 minutes using @thebearmay 's Hub info driver v3).
In terms of automation, I don't have anything that runs on that sort of two-week cycle. Device and app stats don't reveal anything obvious IMHO:
The only errors I found in the logs are these DNS name resolution failures, which I see from time to time and which are nothing to worry about. No interesting warnings either:
As this hub is covered by Hub Protect, I submitted a ticket and was told that this can be due to an overloaded hub or an "incompatible device". I was advised to remove any custom drivers, then add the devices/drivers back one by one to figure out which one is problematic. If I were to proceed like this, it could take me a long time (remember: the problematic behaviour only manifests after ~2 weeks!), during which key automations would remain broken.
The Zigbee custom drivers in use on this hub are in fairly widespread use AFAIK:
- Stelpro Ki thermostat driver by @philippe.charette (devices 1 and 2, so in use since I got the hub)
- @birdslikewires 's Aqara Temp/RH sensor driver (in use for a couple of years I think)
- @samuel.c.auclair 's Sinope thermostat drivers
- the Zemismart Zigbee blind driver maintained by @kkossev
In addition, there is a custom driver I've written myself (derived from Samuel's work) for the Sinope water heater controller. None of these drivers are giving me problems. The attached devices have all appeared to work normally, dare I say flawlessly, for months. In fact, the Aqara and Sinope drivers have been in use at two different locations with C7 hubs, for months. The other location has not seen any Zigbee offline/online messages (I think ever).
Any ideas to troubleshoot this more efficiently would be most welcome. In particular, I'd appreciate any tips on identifying a "bad/incompatible" Zigbee device that could be taking down the hub's Zigbee radio.
As a next step, I will revert the Sinope drivers to the system versions, as I can do that without too much impact on my automations. However, there is no system equivalent for the blinds or Aqara devices. I will also perform a "soft reset".