ZWAVE Network Issues

I have a C7 running 2.3.9.166 right now. I didn't want to hijack the other current ZWAVE thread running here, which is also interesting to me.

So first off, thanks Jeff @jtp10181 once again for your post on 'before you post':
[‼ READ FIRST - Before Posting in Get Help]

That had a key element for me. More on that later.

A few nights ago I was about to post a ZWAVE SOS thread as things went to hell quickly on my normally stable zwave network. I added a new Zooz ZEN71 800LR switch to my network (paired in mesh/zwave plus as I've only got a C7). I added it with mesh enabled, no security.

That seemed to go fine and normal, but while checking my Z-Wave map and table, I noticed one of my devices (a Leviton Fan Controller ZW4SF) had a last response time of 6 days prior. It controls a ceiling fan and is nestled in a 4 gang with 3 other Leviton DZ6HD controllers (which were working fine). I couldn't remember the last time I tried to control it remotely. When I tested it from my phone, there was no response.
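For anyone who wants to spot this kind of thing sooner, here is a rough sketch of the check I did by eye in the Z-Wave table: flag any device whose last response is older than some threshold. The device names, timestamps, and the 2 day threshold are made up for illustration, and it assumes you have already copied the last-response times out of the hub by some means of your own.

```python
from datetime import datetime, timedelta


def find_stale_devices(last_seen, now, max_age=timedelta(days=2)):
    """Return (name, age) pairs for devices silent longer than max_age, oldest first."""
    stale = [
        (name, now - seen)
        for name, seen in last_seen.items()
        if now - seen > max_age
    ]
    return sorted(stale, key=lambda item: item[1], reverse=True)


# Hypothetical data, not read from a real hub.
now = datetime(2024, 6, 10, 21, 0)
last_seen = {
    "Fan controller": now - timedelta(days=6),
    "Dimmer 1": now - timedelta(minutes=5),
    "Dimmer 2": now - timedelta(hours=3),
}

for name, age in find_stale_devices(last_seen, now):
    print(f"{name}: no response for {age.days} days")
```

With the sample data above, only the fan controller is reported. A mains-powered device that gets used daily going silent for days is the signal worth chasing; battery devices may legitimately be quiet for longer, so the threshold would need tuning per device type.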

The 6 day stale date was interesting to me because I'm pretty sure that correlated to when I upgraded to 2.3.9.166. I stayed on 2.3.8 for a long time because I had to be rock stable during this period. So I was initially wondering if that might be related. I don't know if it's relevant or not. I hadn't added zwave devices in a looong time, and was on 2.3.8 for a looong time. So maybe correlated, maybe not.

I tried a zwave repair on the fan controller, and that failed (timed out on neighbors). That set off a chain reaction, with said neighbors going offline. I tried a few more repairs on those, and the network (among those neighbors) was soon falling apart. I have a large number of devices in this area (mostly various light controllers) that are literally 4 to 12 feet from the hub.

Pretty soon things were getting bad and I was a bit concerned. So I tried to restart the Hub. That didn't help, and a few more repair tries continued to degrade the network. After doing my due diligence to search for the issues and help, I was about to post and saw Jeff's comment that a simple Hub restart does not reset the radio h/w. That never occurred to me, as in my job life we make sure to design that in for all radios in our system, and also have external means (dedicated hardware or a small monitoring controller) so that we can restart our radios in the event that an app, the OS, or our processor(s)/device hang and die.

SO... I did a hard shutdown and power down per the recommendation. After coming back up, the fan controller and other devices were working.

The next night, I added two more Zooz ZEN71 switches (to the same room/area). Same thing happened: a few devices went offline, zwave repairs failed and just took down more devices.

While debugging and testing, I enabled mesh on some powered devices that previously weren't enabled for mesh in this area (I have many and the space isn't that large). In the early days I was trying to limit the hops that routes would take in this area by not enabling all the repeaters.

That didn't help and if anything maybe made things worse. So I repeated the prior night's procedure: hard shutdown, pull power, wait... reboot.

When the Hub came back up, the network was more streamlined than I have ever seen it. Far more of the devices that 'should' be a single hop (literally feet from the hub), were direct connected; the devices which did take multi hops mostly made sense to me based on their distance/location etc. upstairs, or farther away with repeater devices in the general line of radio sight to their location.

I haven't done extensive testing yet tonight, and I'm not adding more devices for a bit, especially until I get a better feel for what's going on.

I'm not sure what to make of all this yet, but wanted to share to get thoughts and/or add more data if there is something going on more wide scale.

This is possible? How?

From the device screen... devices I usually add first show up with MESH disabled (aka, not a repeater), here:

That doesn't do what you think it does


Hm, well, fair enough. I did a double take when you asked me how, as you know this stuff well. So I was half expecting a reply to that effect.

Maybe it was coincidence, but when I first built the network, I was seeing that setting affect neighbor hops and routing when I changed it and did a resync. Maybe a complete red herring, or enabling it sets off something else that happens to affect the radio. And it seemed to have an effect again on the network last night. Could be it's just the act of driving the configure step with a parameter change in it that is the reason.

Yeah if you only have one hub and do not have hub mesh enabled that setting literally does absolutely nothing.

But, like you said possibly the act of opening and saving the page invoked some driver settings to get pushed to the device. Depends on how the driver is coded. All that would really do is ping the device and possibly make it re-route.

Most likely a placebo effect.


Sorry for the mis-speak on that topic.

Interesting to me though, that the zwave radio and 'repair' function steps go off the rails here, but then the whole thing completely heals when the radio is hard restarted. That has me wondering: is that my new procedure going forward for a while (hard shutdown/restart after an add)? Things seem OK tonight so far. I have a few more things to add, but I have other work to do ahead of that, and I'm leery presently.

For the Hubitat hardware and system, having the ability to restart/reset the radios would be a good thing to add going forward (may be impossible if not designed into h/w). We are so used to doing this in my world for all these reasons. Worst case, you need a way to bring the comms back. If it requires a power airgap and you can't get to the location for a while, you're hosed.

Should not normally be needed.

I am having a similar situation right now though, not with adding devices but just the z-wave radio becoming non-functional after a period of time, and I have to do a hard/cold boot. It's being documented over in the beta group right now.

I think this is correct, not possible.

Yup...

I keep my hub plugged into a Wi-Fi plug for this very reason. I can cycle the power remotely if needed.
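If someone wanted to automate that, the logic could look roughly like the sketch below: count consecutive failed radio health checks and only cut power after several in a row. To be clear, `check_radio` and `set_power` are hypothetical callables you would have to supply yourself (for example, something that pings a Z-Wave device and something that toggles your particular plug); nothing here talks to a real hub or plug API, and the thresholds are guesses. Note the procedure earlier in this thread was a shutdown first and then pulling power, and this sketch only covers the power step.

```python
import time


class PowerCycleWatchdog:
    """Cut and restore power after several consecutive failed health checks."""

    def __init__(self, check_radio, set_power, max_failures=3,
                 off_seconds=30.0, sleep=time.sleep):
        self.check_radio = check_radio    # hypothetical: returns True if radio responds
        self.set_power = set_power        # hypothetical: set_power(False) cuts power
        self.max_failures = max_failures  # consecutive failures before cycling
        self.off_seconds = off_seconds    # how long to leave the hub unpowered
        self.sleep = sleep                # injectable so it can be tested without waiting
        self.failures = 0

    def poll(self):
        """Run one health check; return True if a power cycle was performed."""
        if self.check_radio():
            self.failures = 0
            return False
        self.failures += 1
        if self.failures < self.max_failures:
            return False
        self.set_power(False)
        self.sleep(self.off_seconds)
        self.set_power(True)
        self.failures = 0
        return True


# Dry run with stand-ins: the radio never answers and the "plug" just logs calls.
log = []
watchdog = PowerCycleWatchdog(
    check_radio=lambda: False,
    set_power=lambda on: log.append("on" if on else "off"),
    sleep=lambda seconds: log.append(f"wait {seconds}s"),
)
cycled = [watchdog.poll() for _ in range(3)]
print(cycled, log)
```

Requiring several failures in a row is there so one missed response does not bounce the hub. In real use you would call `poll()` on a timer from something that is not the hub itself, since the whole point is to recover when the hub cannot help.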

Circling back around on this item. Did they ever find a fix in beta related to the ZWAVE issue? I hit the same problem trying to add another device (after not adding devices for a few months since my last post above on the topic). As soon as I tried to add it, the add failed... and repeated attempts failed. Zwave repair locked up like before. I did get a ghost out of the issue this time, but discover device and delete appeared to work.

So I did the full shutdown power off, and then the zwave device added just fine (with a few gaps in device id, probably from ghost removal), and network seems to be working ok now.

I am a few versions back on my C7 (2.3.9.184) but didn't see anything specific in release notes around anything like this issue.

I think my issue ended up being resolved by swapping the power supply to a new one.


Thanks. Hmm. Makes me think that maybe mine is a different issue. It has been hard to characterize since I'm usually only adding new devices every few months or so... But this is two for two now (can't add new zwave devices after a few months of operating unless I do a full power off shutdown first). But the network operates just fine with signalling and comms over that time.

I would consider that something not to worry about right now, just be glad that's all you have to do to get it to work.

Could be an issue with the underlying SiLabs z-wave stack. HE is constantly working to improve things where they can.
