I have a C7 and am running 2.3.9.166 right now. I didn't want to hijack the other current ZWAVE thread running here, which is also interesting to me.
So first off, thanks Jeff @jtp10181 once again for your post on 'before you post'.
[‼ READ FIRST - Before Posting in Get Help]
as that had a key element for me. More on that later.
A few nights ago I was about to post a ZWAVE SOS thread as things went to hell quickly on my normally stable zwave network. I added a new Zooz ZEN71 800LR switch to my network (paired in mesh/zwave plus as I've only got a C7). I added it with mesh enabled, no security.
That seemed to go fine and normal, but while checking my Z-Wave map and table, I noticed one of my devices (a Leviton Fan Controller ZW4SF) had a last response time of 6 days prior. It controls a ceiling fan and is nestled in a 4 gang with 3 other Leviton DZ6HD controllers (which were working fine). I couldn't remember the last time I tried to control it remotely. When I then tested it from my phone, there was no response.
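For anyone who wants to catch this kind of silent dropout sooner, here is a rough sketch of the check I effectively did by eye on the Z-Wave table: flag any device whose last response is older than some threshold. To be clear, this is not a Hubitat API. The function name `find_stale_devices`, the device names, and the timestamps are all made up for illustration, and it assumes you have already copied the last-response times out of the table yourself.

```python
from datetime import datetime, timedelta


def find_stale_devices(last_seen, now, max_age=timedelta(days=1)):
    """Return names of devices whose last response is older than max_age, oldest first.

    last_seen: dict mapping a device name to the datetime of its last response.
    """
    stale = [(name, ts) for name, ts in last_seen.items() if now - ts > max_age]
    stale.sort(key=lambda item: item[1])  # oldest timestamp first
    return [name for name, _ in stale]


# Hypothetical sample data, typed in by hand from the Z-Wave table.
now = datetime(2024, 6, 10, 21, 0)
last_seen = {
    "Fan Controller": now - timedelta(days=6),
    "Dimmer 1": now - timedelta(minutes=5),
    "Dimmer 2": now - timedelta(hours=2),
}

print(find_stale_devices(last_seen, now))
```

The one-day default is arbitrary. A mains-powered switch that reports often could use a much tighter threshold than a device that is rarely touched.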
The 6 day stale date was interesting to me because I'm pretty sure that correlated to when I upgraded to 2.3.9.166. I stayed on 2.3.8 for a long time because I had to be rock stable during this period. So I was initially wondering if that might be related. I don't know if it's relevant or not. I hadn't added zwave devices in a looong time, and was on 2.3.8 for a looong time. So maybe correlated, maybe not.
I tried a zwave repair on the Fan controller, and that failed (timed out on neighbors). That sort of set off a chain reaction of said neighbors going offline. I tried a few more repairs on those, and the network (among those neighbors) was soon falling apart. I have a large number of devices in this area (mostly various light controllers) that are literally 4 to 12 nautical feet from the hub.
Pretty soon things were getting bad and I was a bit concerned. So I tried to restart the Hub. That didn't help, and a few more repair tries continued to degrade the network. After doing my due diligence to search for the issues and help, I was about to post and saw Jeff's comment that a simple Hub restart does not reset the radio h/w. That never occurred to me, as in my job life we make sure to design that in for all radios in our system, and also have external means (dedicated hardware or a small monitoring controller) so that we can restart our radios in the event that an app, the OS, or our processor(s)/device hang and die.
SO... I did a hard shutdown and power down per the recommendation. After coming back up, the fan controller and other devices were working.
The next night, I added two more Zooz ZEN71 switches (to the same room/area). The same thing happened: a few devices went offline, and zwave repairs failed and just took down more devices.
While debugging and testing, I enabled mesh on some powered devices that previously weren't enabled for mesh in this area (I have many and the space isn't that large). In the early days I was trying to limit the hops that routes would take in this area by not enabling all the repeaters.
That didn't help and, if anything, maybe made things worse. So I repeated the prior night's procedure: hard shutdown, pull power, wait... reboot.
When the Hub came back up, the network was more streamlined than I have ever seen it. Far more of the devices that 'should' be a single hop (literally feet from the hub), were direct connected; the devices which did take multi hops mostly made sense to me based on their distance/location etc. upstairs, or farther away with repeater devices in the general line of radio sight to their location.
I haven't done extensive testing yet tonight, and I'm not adding more devices for a bit, at least until I get a better feel for what's going on.
I'm not sure what to make of all this yet, but wanted to share to get thoughts and/or add more data if there is something going on more wide scale.