Zigbee Radio - Intermittent Brief "Off"s

Over the last few weeks the Zigbee radio on my hub (a C-7 running 2.3.5.152) has been turning itself off for around 6.5 seconds on an apparently random basis (log pasted further down). Once it turned itself off and stayed off until I rebooted.
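
For anyone wanting to measure these gaps on their own hub, here is a minimal Python sketch that pairs each radio "off" entry with the next "on" and reports the gap in seconds. It assumes the entries have been copied out by hand as (timestamp, state) pairs. The `EVENTS` list, the timestamp format and the "off"/"on" labels are all made up for illustration and are not the actual log:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Hypothetical (timestamp, state) pairs, copied by hand from a hub event list.
# The values are illustrative only, not taken from any real log.
EVENTS = [
    ("2023-08-01 03:12:04.100", "off"),
    ("2023-08-01 03:12:10.600", "on"),
    ("2023-08-02 14:40:00.000", "off"),
    ("2023-08-02 14:40:06.400", "on"),
    ("2023-08-03 09:00:00.000", "off"),  # no matching "on": radio stayed off
]


def outage_durations(events):
    """Return (gaps_in_seconds, still_off) for a list of off/on events."""
    gaps = []
    off_at = None
    # Timestamps in this format sort correctly as plain strings.
    for stamp, state in sorted(events):
        when = datetime.strptime(stamp, FMT)
        if state == "off" and off_at is None:
            off_at = when
        elif state == "on" and off_at is not None:
            gaps.append((when - off_at).total_seconds())
            off_at = None
    # still_off is True when the last "off" never got a matching "on".
    return gaps, off_at is not None


gaps, still_off = outage_durations(EVENTS)
print(gaps, still_off)
```

The `still_off` flag covers the "stayed off until I rebooted" case, where the last "off" has no matching "on".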

I added a few Zigbee devices a while back (a couple of 4-way TS004F scene switches and a Zemismart 3-way wall switch), but don't think the misbehaviour coincides with that (and FWIW there've been other TS004Fs on the system for at least a year).

Other than this Zigbee oddness everything appears to be working normally, and the general logs only have the occasional driver error (a NullPointer error for an ethernet-based Evohome thermostat system).

I've searched the forums and found a few mentions of similar behaviour, but no established cause or fix. Does anyone have any thoughts?

A C-7. Interesting.

At least you have an idea of what might cause it, with your recent changes.

This Zemismart 3-way wall switch: what are the specs on that?
Does it use an older Zigbee protocol? (I forget the series, is it 500 vs 700, or is that Z-Wave?)

That's Z-Wave. Zigbee is either ZLL, ZHA 1.2, or 3.0.

Typically vague, unfortunately - ordered from Amazon.co.uk, with few specifics on technical details:

https://www.amazon.co.uk/dp/B07ZM3CBHP

AFAIK the 500/700 business is Z-Wave.

I'm using kkossev's updated Muxa driver with it, and have had no problems at all with the switch itself while using that (and no errors or warnings for it in the logs, just debug entries).

I imagine @kkossev will say it's the latest.

But still, I think this is the first C-7 on the forum that's behaved like this.
And, you know you added three devices just previous.
Two of the devices were the same as you already had.
So, how about removing the 3-way for a little while?
I'm not sure how much "removing" would be required; probably all wired up now.
If it's still powered, even if not on the mesh, perhaps it could still raise havoc?

raises hand - sorry, but although I can't show logs to prove it, this was one of my symptoms during the great C-8 Zigbee crisis. (And yes, I'm on a C-7 and 100% Zigbee; I don't even turn on the Z-Wave antenna.)

In the end, removing some apps and devices was the fix, but I didn't isolate the cause; I just moved on. I too was using some @kkossev drivers (Zemismart) and Tuya items. It wasn't them. For ME, it was taking out the power monitoring app, the power monitoring outlets and some older Zigbee items. XCTU was very helpful at that time as well. I initially blamed HE, but just gave up and moved on.

There are some earlier similarities; in this thread a C7's off for a few minutes rather than seconds. Same here, with hub model not listed (too early to be a C-8, though!).

In both cases "hub overworked" was listed as a possible reason but AFAICS not confirmed as such - I've checked load on my own hub, and it's low.

Removing the 3-way switch from the mesh would be doable (disconnecting it would be more of a challenge), but in the first instance - and perhaps unfairly - this is feeling more like a hub-level issue rather than something caused by an external device.

My Zigbee mesh has 50 devices in total (around 6 of them disconnected - Christmas outlets!) and other than those recent additions hasn't changed for some time.

Thanks for the info - I've just checked my outlets, and FWIW those that support power monitoring in the first place have already had it disabled (I guess I decided to play safe with that a while back, then forgot about doing so :man_shrugging:), and have no monitoring app.

It's looking increasingly like temporarily taking the Zemismart off the network may be worth at least trying...

I'm being bold here - but when I struggled (and it was a mighty struggle, with fingers pointing everywhere) I ultimately started tearing down - disabling devices and apps until I reached stability. In my case, and I don't recall exactly anymore, it ended up being an app that caused it. There were no errors in the logs or anything; it was disabling some of the apps and waiting 2-3 days to see if it worked that ultimately got me past this horrid time. (I was a mess for nearly 2 months when the C-8 came out and I was certain HE was at fault. They weren't, or if they were, some/all of the fixes they were putting in at that time took care of it - timing issues, I think? Some other stuff? I'd have to go back and read the threads.)

I can't recall proof for even one single case where a Zigbee or Z-wave driver was the reason for Zigbee radio-off issues. All these drivers are event driven, they process the incoming messages quickly and exit.

Likely it was something LAN-based. At least that's what I've found when that happens.

Is that LAN activity from the hub itself, or general LAN activity impacting on the hub?

FWIW I've got very little LAN-based on the hub; the Evohome app and devices mentioned above, and a few Kasa outlets using the built-in app and drivers. Both have been in place since last autumn.

"Just in case", early this afternoon I tried completely uninstalling the Evohome setup (we can get by without central heating management in August :grinning:).

After that, and later in the day (it's now late evening local time) I've checked the logs, and unhelpfully there've been no Zigbee offages at all since midnight yesterday so it's hard to be sure whether removing the Evohome's made a difference.

I'll continue to keep an eye on things.

LAN integrations. For instance, in my case it was caused by my iotawatt being off while the integration kept trying to reach it, with no errors in the logs. Once my iotawatt came back online all was good. The Zigbee radio is the first thing to shut down on hub overload.
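
Since an unreachable LAN device may not show up in the hub logs at all, one way to rule this out is to check reachability from a PC on the same network. The following is only a sketch: it runs on a separate machine rather than on the hub, it only tests whether a TCP port accepts a connection, and the device names, addresses and ports in `DEVICES` are placeholders rather than values from this thread:

```python
import socket

# Hypothetical devices: the names, addresses (documentation range) and ports
# are placeholders for whatever the hub's LAN integrations talk to.
DEVICES = [
    ("energy-monitor", "192.0.2.10", 80),
    ("smart-outlet", "192.0.2.11", 9999),
]


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unroutable hosts.
        return False


def unreachable(devices, timeout=2.0):
    """Return the names of devices that did not accept a connection."""
    return [name for name, host, port in devices
            if not is_reachable(host, port, timeout)]
```

Calling `unreachable(DEVICES)` returns the names that did not answer. A device that accepts the connection can still be misbehaving, so an empty list is a hint rather than proof.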

(Final?) Update on this - after a few weeks of experimentation (including a soft reset when we were away from the house for a few days and DB restoration on return, to try to verify it wasn't a hardware issue) the cause of the Zigbee off/ons has provisionally been pegged as this app/device driver for a Google Nest thermostat. Something I'd forgotten was in place[1], although FWIW these did seem to be working as they should in themselves.

Not 100.0% sure, as it wasn't practical to wait a few days to see whether new off/on log entries appeared after every individual change, but after uninstalling a set of "definitely not in use" apps and the Google one there've been no recurrences for 3 days.

[1] Realised it was a thing after seeing "Received cloud request for App xxx that does not exist" after doing the soft reset, then checking which app xxx was after the restore.

Just out of curiosity - I certainly won't try to claim that the Nest integration is fault-free - but a few questions and tips that may help if you still want/need to connect your thermostat and/or cameras ..

Were you running the latest code? There was a nasty bug a little while back which could bring the hub to a crawl, because of a rogue log statement which tried to log an mp4 clip :upside_down_face:

If on the latest version, I'd make the following suggestions if you want to try re-adding it:

  • Leave debug logs off for the App (they can get verbose, and maybe there's another odd one hiding)
  • Turn off device settings for image captures - perhaps the download of this data was slowing things just enough?
  • Likewise turn off Google Drive in App settings - the re-upload to Google again may have incurred some slowdown?

Then you could re-enable these portions 1-by-1, if desired, to see if it re-introduces the problem.

Interesting. Whenever I do a soft reset, I immediately do a restore. I've never checked the logs.

I think so - I had 1.0.8, which according to the importURL is the latest.

As it's turned out Nest handling is no longer needed for our Hubitat setup - it was very useful (and thanks for making it available! :1st_place_medal:) in our previous home, but in our current one we've switched to Evohome. It was continuing to monitor a second Nest used by a family member (so didn't have any null/stale device entries), but that was more an oversight than a requirement and once I'd realised it was still active the most practical option was to remove it completely rather than attempt more detailed troubleshooting.

Just when you thought it was safe to go back in the Hub Events list...

This is more for people's info than anything else, and to apologise to @dkilgore90 for unjustly suspecting his driver.

After several days of no Zigbee radio issues I put the Evohome app/driver combo back on...and the off/on's returned. So I removed the Evohome items...and the off/on's continued. :confused:

So, there's a lot of speculation in this, but possibly my hub was teetering on the edge of being overloaded and near enough anything extra added will start it misbehaving? Still no idea of why it might have a high load - near enough all my apps have triggers from specific devices (mainly motion sensors, some contact ones) rather than running periodically in the background, and some of the off/on's happen in the early hours of the morning (3-4am) when there'll be next to no triggers anyway, but the logs are what they are. :man_shrugging:

I'll continue to scratch my head over this, run what tests occur to me, and keep people updated on any progress.

Well, an update, but not a very useful one - after trying/checking various things:

  • The off/on's happening don't seem to be correlated with general hub activity levels. There can be stretches of several days of regular house activity when none occur, and also half a dozen off/on's over an hour at a time when there's nothing to speak of going on (early hours of the morning, house empty in the day, etc...).

  • Looking at logs there's no correlation between off/on's and any other event occurring (for a device or device type) that I can see - the off/on's can happen well clear of any other event being listed.

  • Just to reiterate something covered above, the only internet-related app I have running at the moment is the stock Kasa one, controlling 3 outlets (only 1 of which is plugged in right now), and power monitoring is turned off for everything that supports it.
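
For anyone wanting to repeat the "is there anything near each off/on?" check without eyeballing the logs, here is a rough Python sketch. It assumes the radio-off times and all other logged event times have already been converted to seconds (for example, epoch seconds). The function name and the sample numbers are hypothetical, not taken from any real log:

```python
from bisect import bisect_left


def nearest_gap(off_times, other_times):
    """For each radio-off time, seconds to the closest other logged event.

    Returns None for an entry when there are no other events at all.
    """
    others = sorted(other_times)
    gaps = []
    for t in off_times:
        i = bisect_left(others, t)  # index of first event at or after t
        candidates = []
        if i < len(others):
            candidates.append(others[i] - t)      # next event after the off
        if i > 0:
            candidates.append(t - others[i - 1])  # last event before the off
        gaps.append(min(candidates) if candidates else None)
    return gaps


# Illustrative values only: two radio-off times and three other events.
print(nearest_gap([100, 500], [90, 130, 900]))
```

Consistently large gaps would support the "no correlation" reading. Consistently small gaps against one device type would be worth a closer look.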

Off's can happen while normal activity's taking place (determined by checking logs after the event), and the occasional failure of motion sensors, voice commands, etc... to control lights is impacting on the WAF.

After a bit of searching I've seen people with C-7s and C-8s reporting this in other threads, and while there's user speculation on possible (probable?) causes AFAICS there's no recognised solution or official position on it.

With the WAF in particular in mind, can I ask @bobbyD where things stand with this behaviour at Hubitat? Are there any recommended fixes, or failing that any further tests I could run to try to identify the cause? Very happy to supply log details/DB backups/anything else if that would help.