Daily Z-wave crashes

I've had a C-7 for a week and have been migrating my devices to HE from Vera. I'm at 66 Z-Wave devices in my system, plus a few non-Z-Wave items connected via apps and/or APIs, such as a Netatmo weather station, Netatmo cameras, a Doorbird doorbell, etc.

Every day, it’s as if my whole Z-Wave mesh disappears and becomes unresponsive. All devices stop responding to commands, and if I try to repair the mesh, every single one shows as unreachable. A reboot does not cure this; the only cure is waiting indefinitely or a full shutdown.

The devices connected via API respond fine. The logs give me no clue as to what might be happening.

The only thing I can think of is that this was not happening with 2.2.9; it started after 2.3.0. This morning I went from 2.3.0.120 to 2.3.0.121, but that has not changed anything: the Z-Wave crash happened again just now. I tried a reboot and the crash remained, but a shutdown recovered my mesh.

Initially I thought I had RM rules getting stuck, but I've already eliminated that possibility.

Any help on how to troubleshoot this?

There have been no Zwave changes in recent firmwares. So the updates are likely not the issue, although they may have uncovered a hidden problem.

Please post screenshots of your Zwave Details page. Maybe someone will be able to spot something wrong.

Also, are there any errors/warnings in the Logs tab? While you are in Logs tab, go over to Apps stats and Device stats and see if anything shows excessive use. (I expand the double arrows near the Hub Uptime and select ALL stats).

Have you had any failed pairings? As said above, stop what you are doing and post your z-wave details page. Do you have any zooz sensors in the mix?

I have a similar experience, starting about 2-3 weeks ago. I've had a Hubitat for at least 18 months, migrated from a ST (had it for probably 3 years). Have maybe 50 devices, mix of hard powered and battery powered. A hard shutdown will get things working again, sometimes for 15 minutes and sometimes for 24-72 hours. Eventually all Zwave commands will stop being sent. Soft reset does nothing. Debug will show the button push if I trigger from the web UI but no Zwave command sent (or received). A Zwave repair will show this:

Every node failed. Before (i.e. 6-8 weeks ago) I was getting an entertaining result: one time, half of the nodes would fail. I'd run the repair again an hour later and literally the other half of the nodes would show as failed.

I've found Zwave to be a pretty pathetic protocol in terms of reliability, troubleshooting, etc. but this is really taking it to a new level. At this point I've removed pretty much every app that isn't super basic. No apps have been added in at least 6 months. I did install some Zooz LED controllers but they aren't flooding the logs.

Different issue, not sure if it's related: I have a few Homeseer switches, one of which is not joined. I run exclusion and exclude it, factory reset it, then try to add it. It adds, but as a different Homeseer switch that was already in the network. Logs clearly show the old switch's activities but not the new switch's.

Anyone have any ideas? Otherwise I can't seem to find out what is causing this issue.

Welcome to Hubitat forums!

Generally you shouldn't be running a full Zwave repair. It is fairly common to see failed nodes when doing a full repair, for whatever reason. If you do need a repair for some reason (and you really shouldn't need one with Zwave Plus devices), do an individual node repair and wait a while before doing another.

My advice would be the same as above:

Thx for your replies. Here's my screenshot, nothing looks unusual to me. I don't have the time to fully investigate the logs now, only later on today, am about to start a day-long Zoom but will do and report back.

You have at least 2 ghost nodes in your mesh: #43 and #56.

Take a look at this post. It's saved me a lot of headaches. I've seen just one node on a much smaller system have a similar effect to what you are seeing.

Hi lcw731, #43 (decimal not hex?) is my main door lock and #56 (decimal not hex?) is a light and both respond fine to Z-wave commands when the network is functioning normally. What leads you to say they are ghosts?

They're not showing any route in the last column.
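
If you'd rather not eyeball the table, the same check can be sketched in a few lines of Python. This is only an illustration: the rows are typed in by hand, and the field names and values are made up, not something Hubitat exports. It also assumes that a battery device with an empty route may simply be asleep, so only mains-powered routeless rows are flagged as ghost candidates.

```python
# Rows typed in by hand from the Z-Wave Details page.
# Field names and values are made up for illustration only.
nodes = [
    {"node": 10, "name": "Hall light", "battery": False, "route": "01 -> 0A"},
    {"node": 11, "name": "Motion sensor", "battery": True, "route": ""},
    {"node": 12, "name": "Unnamed device", "battery": False, "route": ""},
]


def routeless(rows):
    """Return every row whose route column is empty."""
    return [r for r in rows if not r["route"].strip()]


def ghost_candidates(rows):
    """Routeless rows that are mains powered, so sleeping cannot explain the gap."""
    return [r for r in routeless(rows) if not r["battery"]]


if __name__ == "__main__":
    print("No route:", [r["node"] for r in routeless(nodes)])
    print("Possible ghosts:", [r["node"] for r in ghost_candidates(nodes)])
```

Anything this flags is only a candidate. A device that still responds to commands is not a ghost, whatever the route column says.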

It is also generally advised to never have S0 as security. The devices should be paired with security set to none as S0 is "chatty". Unfortunately if they are ZW500 devices they will automatically join with S0. The only way to get around that is to pair a USB stick to the hub as a secondary controller and use that to include those devices.

The Z wave stick (I use an Aeotec) along with Silabs PC Controller app is invaluable for sorting out these issues.

Ah, 46 and 53. Those are battery operated, would that be normal and only show up when they wake up? Is that possible?

I'm no expert on that, but all of my devices, including the battery ones, show a route. My issues disappeared once I did away with S0.

How did you manage to get rid of S0? Is there a way to force non-secure inclusions in Hubitat? It seems all my Fibaro devices do that by default.

Yes if you see my post above - you get a USB Z Wave stick.

When you want to include a ZW500 device, you include the stick first as a secondary controller. When you fire up the PC Controller app, you do the inclusion with that instead of Hubitat and it is included (in Hubitat) with security set to "none". There are loads of posts regarding this as it's a common issue. I too have multiple Fibaro devices (Dimmer 2/Switch 2) where I've needed to use that process. If you include a newer device (ZW700) Hubitat gives you a pop up box regarding security and you can just click "skip" to include without security. Unfortunately there is nothing the Hubitat team can do about this as it's a Silabs Z Wave implementation that forces any device to join at S0 if available.

The post linked by @lcw731 gives the procedure in full and that's what I followed. The first one was a bit of a pain, but now even if I have just a single device to do, it's pretty easy. Unfortunately you cannot change the security level of a device that is already included. You will need to exclude and then re-include each of the S0 devices.

Ok, I woke up those routeless devices and they are now showing routes.

Regarding S0, if the network stays stable as is, I'll keep it this way. Should it go south again, I will follow your advice and get a USB stick to see if getting rid of S0 stabilises things.

Which makes me wonder if my issues and instability whilst in the Vera world were due to S0 as well...

I would also get rid of S0 (I'd get rid of any security except on locks and garage door openers). The issue is that S0 chattiness can overwhelm a mesh and cause exactly the symptoms you're having.

This is the issue with your mesh. Way too many S0 devices. And some of them appear to be power reporting devices. So not only will they frequently send messages on the z-wave mesh, but they will generate three times the amount of traffic as devices paired without S0. This is guaranteed to bog your mesh down.
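
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes each S0 report costs three frames (nonce request, nonce report, encrypted command) against one frame without security, and it ignores acknowledgements, retries and routing hops. The device count and report rate are invented for illustration, not taken from the mesh above.

```python
# Frames on the air per report. The 3x figure for S0 is the assumption
# discussed in this thread (nonce request + nonce report + encrypted command).
FRAMES_PLAIN = 1
FRAMES_S0 = 3


def frames_per_hour(devices):
    """Estimate frames per hour for a list of reporting devices."""
    total = 0
    for d in devices:
        per_report = FRAMES_S0 if d["security"] == "S0" else FRAMES_PLAIN
        total += d["reports_per_hour"] * per_report
    return total


# Hypothetical mesh: 10 power-reporting devices, one report per minute each.
with_s0 = [{"security": "S0", "reports_per_hour": 60} for _ in range(10)]
without = [{"security": "none", "reports_per_hour": 60} for _ in range(10)]

if __name__ == "__main__":
    print("Paired with S0:", frames_per_hour(with_s0), "frames/hour")
    print("Paired without security:", frames_per_hour(without), "frames/hour")
```

Cutting the report rate or dropping S0 each reduces the total, which is why the advice here is to do both.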

I have already turned off the power reporting messages, as I'm not even using them. I will keep this post updated on whether the issue recurs; if it does, I will move on to eliminating S0 from my mesh, except for the S0 door strike controller and the two S2 locks.

BTW, Aeotec already has a USB stick out with the 700 chip. It retails for the same price, so I might go for that one instead of the more commonly available Gen5+.

If you do, I believe there's a setting in the PC Controller software that ensures that when you include a device, it is included without security. The last thing you want is to buy a stick and find it still includes at S0, as you'd be no further forward.

@laalves

With so many line-powered S0 devices, there is also the real possibility that they act as repeaters for other devices in your mesh, which, I think, would exacerbate the issue.
