C8 z-wave inclusion/exclusion issues

So another C8 Z-wave thread. For context I have a pretty large Z-wave network: ~45 devices, mostly SmartWings shades, a couple of Z-wave Inovelli switches and some door locks. Day to day the network is largely stable and responsive. Been using it generally unchanged for about 2 years. When I first set it up it was a new home build and I added the first 35 or so devices very easily. I did it wrong in that I added everything one after the other in a test rig to the hub (prior to the move), then unplugged each device and did the next one (so added 30-some devices while only having one online at a time). Initially everything paired very smoothly and easily. When the device count got high it started taking longer to pair and sometimes failing. Recognizing I was probably confusing my network, I waited till after the move, installed everything in place, powered up and let the network settle for several days. That did the trick and everything seemed to communicate very well.

However, ever since then (for 2 years now) it is nearly impossible to pair any new devices without multiple attempts. I did start using a Z-wave UZB stick to troubleshoot and I've found that almost every inclusion attempt fails and creates a ghost that I need to remove before starting again. For some context on how hard it's been, every ghost device has an ID number assigned that is never reused. My current active device ID list is pretty linear up to about #38, then is missing about 1-4 numbers every 10 devices, then jumps from 73 to 88 and 92, representing a lot of partially failed inclusion attempts.
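For illustration, the gaps in a node list like that can be counted with a short script. This is only a sketch: the ID list below is made up to resemble the pattern described (it is not the real node table), `missing_node_ids` is a hypothetical helper name, and it assumes the hub itself is node 1 so device IDs start at 2.

```python
def missing_node_ids(active_ids, first_id=2):
    """Return IDs between first_id and the highest active ID that are not in use."""
    active = set(active_ids)
    if not active:
        return []
    return [n for n in range(first_id, max(active) + 1) if n not in active]


# Hypothetical node list: contiguous up to 38, then a few survivors.
example_active = list(range(2, 39)) + [40, 41, 43, 73, 88, 92]
gaps = missing_node_ids(example_active)
print(f"{len(gaps)} unused IDs below the highest node, starting with {gaps[:5]}")
```

Each unused ID is not necessarily a failed inclusion (a deliberately excluded device also leaves a gap), so the count is an upper bound on ghosts rather than an exact figure.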

Everything is paired with no security except 3 S0 locks and 1 (formerly 2) S2 locks.

Just today I tried to exclude and re-pair an S2 door lock that was not reporting properly (ever since its inclusion over a year ago). It excluded but left a ghost, and despite multiple attempts (right next to the hub), including multiple excludes, includes and ghost removals, I was never able to get it to join the network. The hub never even found it and reported it was initializing (despite it sometimes, but not always, creating a ghost). I recall having similar pains with the last few light switches I added 2 years ago when we moved in, but eventually got them online. The network seems to become more unstable every time I attempt to add a device, and I actually couldn't even exclude the UZB stick (when I was done trying) until I did a hard unplug reset of the hub.

So I'm left with a reasonably well functioning Z-wave network, but feel like I'm stuck and unable to ever make any changes to it. I'm very hesitant to do any further troubleshooting lest it fail completely and I have to find the remote controls for 20 blinds (which are currently automating VERY nicely in response to summer time solar fluctuations in our many south facing windows).

From perusing other threads some pertinent info:
Hub is USB powered, but is connected to an unmanaged switch by ethernet.
I also have a large (40+ device) Zigbee network that is rock solid and still easily adds and removes devices.
No weird jumps in any device communication paths.
Running platform 2.4.1.177, Z-wave using US-LR region setting.

I do still have my old C7 I'm no longer using, as well as a Home Assistant Yellow. I'm half wondering if the safest thing to do is migrate devices back to the C7 on a mesh (although it had its own problems at times), or get a Z-wave stick and start moving them to HA, or use it for any new devices. I'd prefer to keep them where they are given all the automations I've already set up, but as I said I don't think I can ever reliably add or remove devices on this network. Also I'm not sure how wise it is to have 2 overlapping Z-wave networks.

I'll add some shots from my system for context.

Thanks for any tips or insight.

I don't see anything obvious, but I am far from an expert.

I got in the habit of shutting down my hub and doing a full power cycle to fully reset the Z-wave radio. This seems to help the radio clear up anything that has built up over time.

I would look at adding some more repeaters if possible. You have a lot of non-repeating devices, and more repeating devices will only make the network stronger.

A lot of the time when the network gets overwhelmed, it is caused by an overly chatty device. You may want to monitor the logs and look for devices monopolizing the airtime. Maybe even look into setting up a sniffer.

Something that caught my attention was the comment about the network not using security. It shouldn't be a problem to allow devices to use whatever security the device wants to use by default. My network has over 50 S2 devices and works fine. Try to avoid devices that use S0 if possible, but S2 shouldn't cause any issues.


Thanks for the tips. I did do a full reboot today when trying to pair that lock, but it didn't seem to help. Most of my devices are shades and don't have a security option. Only using security on locks that require it, hence the S2 and S0 (all those ones support). I tried to strategically position the few Z wave light switches I'm using to provide decent repeater coverage, but most devices seem happy communicating directly with the hub. I'll try watching logs, but honestly I think trying to pair anything new right now (including repeaters) just won't happen. I can't see how a bunch of non-repeaters would affect pairing a new device sitting right next to the hub, but Z-wave is weird, so I guess it's possible.

Where would I look into a sniffer? Not something I've heard much about.

Hi - I’m no expert either but I do have about 75 Z-wave devices… as maverick58 noted, S0 has always been said to be bad for the mesh.

I don’t think repeaters are a problem given that almost all your devices are direct path to the hub.

I do find it’s good practice whenever you are including a new device to do an exclusion first. This may seem counterintuitive, but it does help.

And yes powering off everything not just reboot will help… I know this can be a chore with a lot of battery devices but maybe worth the effort.

Have you monitored the Z-wave logs to see how busy your network is?

A lot of the time, issues with Z-wave get traced back to a failing device spamming the network, or to devices configured in a way that overloads the network.

Inclusions/exclusions are network intensive processes and as such if you have something hammering the network it tends to cause issues.

A Zniffer is generally a stick or device with special firmware just to listen to and record the network traffic. I actually don't have one, but there are a few discussions around it. That would give you a way to see how busy the Z-wave airwaves are in your house.


So I watched the logs for a couple hours and they don't show any activity outside of commands being sent and responses received, so 2-4 log entries per device per action over a period of a few seconds, as I would expect. Speeds are around 40 kbps, maybe about half the actions use repeaters, and times range from 8-174 ms, taking longer when a bunch of commands are sent at once in automations (I do stagger them slightly). A handful of "true" route changes were logged. I don't see any chatter or updates from my S0 locks except when I initiate it.

I read a bit about sniffing, and I'm still a little unclear if it's likely to uncover much else that's not logged. And as I said, I very rarely run into obvious network slowdowns, which seem to be the most common cases where it helps; it's just new inclusions that fail. Maybe I should test with a new device just in case it was actually the last few devices I tried to manage that have been faulty.

That's not what was suggested. Did you shut down and unplug the hub from the wall for 30+ seconds? I assume you must have read that tip already somewhere but just want to confirm.

That's the only way to power it, with the USB port. But what is the USB cord connected to?

Is your hub located near your networking gear? Hubitat and Network Switch Interference


It looks like you possibly took the Z-wave details screenshots shortly after a reboot, and/or possibly did not refresh the stats first?


The point of watching the logs was simply to see if there was anything obvious spamming the Z-wave network, like a switch or plug spamming power stats. It is very easy for such devices to overwhelm the network if the report settings are set to report too frequently.
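As a rough sketch of what "looking for a chatty device" could mean in practice, the snippet below counts log entries per device over an observation window and flags anything above a rate threshold. It assumes the log entries have already been copied out by hand into (device name, timestamp) pairs; that format, the `chatty_devices` helper and the 60-per-hour threshold are all assumptions for illustration, not something the hub provides.

```python
from collections import Counter


def chatty_devices(entries, window_hours, max_per_hour=60):
    """Return {device: entries per hour} for devices above max_per_hour.

    entries: iterable of (device_name, timestamp) pairs (assumed format).
    """
    counts = Counter(name for name, _timestamp in entries)
    rates = {name: n / window_hours for name, n in counts.items()}
    return {name: rate for name, rate in rates.items() if rate > max_per_hour}


# Hypothetical two-hour capture: one plug reporting power constantly.
sample = [("Power Plug", t) for t in range(500)] + [("Front Lock", 0)] * 3
print(chatty_devices(sample, window_hours=2))
```

A quiet network, where every entry lines up with an automation, would return an empty result here, which is itself useful to know before moving on to a sniffer.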

That said, the Z-wave logs only show information that the Z-wave radio can decipher. That doesn't mean the airwaves of the Z-wave network aren't being consumed in other ways. It has been found that devices sometimes fail in a way that spams the network and jams up the Z-wave radio so it can't perform well. That is where the Zniffer comes in. With one of those set up, it just sits there and listens to the network. In many cases it can show misbehaving devices you can't see in the logs, because you are capturing the data from the air and not just what the hub can decode. This is how you can see a device passing malformed information, or just bad data.

It may also be good to look through the hub's logs and see if you have any occurrences of the Z-wave radio being busy.

Did you shut down and unplug the hub from the wall for 30+ seconds? I assume you must have read that tip already somewhere but just want to confirm.

Yes, that is what I meant by "full reboot."

Hub is not PoE; the power supply is on its own brick. It is wired in by ethernet and is relatively close to a busy network switch and several other electrical devices in the utility room. I could certainly try positioning it elsewhere if that is known to cause interference. Is having it ethernet wired an issue at times? I prefer that to wireless just for overall wifi congestion, and since I have a lot of API commands going between HE and HA.

I'll look at grabbing some refreshed and more "stale" z-wave stats.

Is there a specific place in the logs to look for evidence of Z-wave radio congestion, other than just the Z-wave logs page? I've had it open for 18 hours now and there are only 366 entries, all seeming to correspond to expected automations.

I'll keep ordering a sniffer on the troubleshooting list if nothing else pans out.

Cheers.

There have been a few cases where people found either the rack itself or the other equipment was causing problems.

Ethernet is preferred.

For the power supply, just make sure it is high quality and ideally 2A or greater. Any sort of unstable voltage output seems to impact Z-wave performance on the C8. Have not heard of any issues with dedicated supply blocks though; typically issues are from PoE adaptors.


I'll try placing it elsewhere for a couple days and see how it does.

Meanwhile, here's the refreshed Z-wave info after just under 48 hours of uptime. Topology map is pretty much identical. Is the number of devices listing multiple route changes in that period an indication of connection issues? It's a pretty big mesh giving them lots of options, so I'm not sure if that's a big deal or not. That said, there is still a limited number of repeaters.

Only other thing I can think of is I do have that one seasonal power plug (device 92) that is not plugged in and wasn't excluded. Although the joining struggles were still there when it was plugged in.

The higher route changes are normal for FLiRS devices (non-sleepy battery devices like locks and shades).

I would plug that outlet in somewhere and get it talking to the hub again. Due to your limited number of repeaters it could be causing some pairing issues. Not sure what it is but some people have no issues with a bunch of dead/ghost nodes and then other times just one will totally break all pairing.


Out of curiosity, what generation of Z-wave is your gear? Is it all Z-Wave Plus (500, 700, 800, or 800 LR capable)? What brands?

Sorry for the delay. I didn't get a chance to check the Z-wave version of everything. Where would I find it? Most stuff was ordered within the last 3 years, for what it's worth. SmartWings blinds and a few older-gen Inovelli switches (Red and Black).

So after moving the hub to a new location well away from the network junction boxes that it was previously mounted near, I was able to add one of my trouble locks without any issue, so it may well be that interference was behind the problem, despite fairly normal daily operation. I'll keep an eye on it, but thanks for all the troubleshooting tips!