Zwave Zombie Node

I have a Hubitat C7 hub and it seems like the Z-wave stuff lately is ignoring rules.
For example I have rules to turn off a Zwave outlet controlled light that sometimes fails.
I also have a number of lights in a Scene that I usually turn on at a certain time, and some of the Zwave Plus devices do not turn on while others in the scene do.

The logs in most cases don't even mention the devices that have failed even though they are in the group and/or scene. I went into the device status and they seemed correct - off when they are off and on when they are on.

One of the suggestions has been to look for Zombie Nodes.

I found one Z-wave node. Could this be an issue? I don't see how to delete it.

That could definitely be the root cause of the issue. To remove the ghost node, you will need to:

  1. Ensure that the device it refers to is off (most likely the next device in the list, but that isn’t guaranteed…)
  2. Press Refresh until you have an option to remove / delete the node
  3. Remove the node (multiple retries might be required)

Thanks I tried that and get the Remove option but the node just stays in a Pending mode. I have tried delete about 5 times but it just stays in Pending.

A little further looking shows an indented device from my Bond Hub, which is a Fan with a light associated with it. It does not show up in the Z-Wave list but does show up in the Device list.
However in the Device list the Remove option is greyed out.

The Bond Hub app does not show the light as a pick option only the fan unit itself.

I am thinking that is the Zombie Device I see in the Z-Wave list. How do I delete it, as it just stays in Pending and does not go away?

The "remove" button on the devices screen I think does something different. From what I've read it deletes the device from the Hubitat's database but it does not delete the device from the zwave radio. What you're trying to do by removing the ghost is get it out of the zwave radio so the mesh doesn't think it's "alive" and try to repeat through it. Successful ghost removal sometimes seems to be as much art as science and you'll find a LOT of commentary on the subject.

The real trick here as @Sebastien mentioned is to ensure that the actual physical device behind the ghost is disconnected from power. I'm probably way over simplifying what actually happens but I understand it sort of like this:

When the hub gets the command to remove the device it first pings the device to make sure it's really failed. If the device is still powered it will respond to the ping and the hub won't remove it. If it's an outlet or a switch you have to cut power to the relevant circuit for the hub to realize the ghost is really gone. Turning off a switch that has a ghost device attached to it won't work, because the switch itself still has power and will respond to a zwave ping. You'll often hear folks refer to "air gapping" a device to ensure it is not powered. If it's a battery operated device, pulling the battery out works.
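To make that ping-then-remove idea concrete, here is a toy model in Python. It only mirrors the simplified explanation above; it is not Hubitat or Z-Wave SDK code, and the `Node` class and `try_remove_failed_node` function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: int
    powered: bool  # True if the physical device behind this entry still has power


def try_remove_failed_node(node: Node) -> str:
    """Toy model: the hub pings first and only removes nodes that stay silent."""
    # Assumption: a powered device answers the ping even if its load is switched off.
    responds_to_ping = node.powered
    if responds_to_ping:
        return "refused: node answered the ping, so it is not treated as failed"
    return "removed"


# Switch turned off but still on a live circuit: the removal is refused.
print(try_remove_failed_node(Node(node_id=36, powered=True)))
# Breaker off or air gap pulled: the node stays silent and can be removed.
print(try_remove_failed_node(Node(node_id=36, powered=False)))
```

The point of the sketch is that the state of the light does not matter, only whether the radio in the device can still answer.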

The difficulty is knowing which device is the "real" device behind the ghost. A common source of ghosts is a failed zwave inclusion, so we tend to look at the next device after the ghost device, assuming that device failed inclusion the first time but succeeded the second. By that logic the ghost is just before the physical device on the zwave details page. That's definitely not a conclusive identification, of course, but it's a good place to start as @Sebastien mentioned. If it's not the last device you successfully included it's probably not too far up the list.
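As a rough illustration of that "look just after the ghost" heuristic, here is a small Python sketch. It assumes you have copied the Z-Wave details list by hand into (node ID, device name) pairs, with the ghost showing an empty name; the function name, the `window` parameter and the sample devices are all hypothetical.

```python
from typing import List, Tuple


def candidates_after_ghost(
    nodes: List[Tuple[int, str]], ghost_id: int, window: int = 3
) -> List[Tuple[int, str]]:
    """Return up to `window` named devices listed after the ghost, nearest first."""
    ordered = sorted(nodes)  # sort by node ID, the order of the Z-Wave details page
    later = [(nid, name) for nid, name in ordered if nid > ghost_id and name]
    return later[:window]


# Hypothetical list: node 36 is the ghost (no device name attached).
nodes = [(34, "Porch light"), (36, ""), (37, "Hall dimmer"), (38, "Desk outlet")]
print(candidates_after_ghost(nodes, ghost_id=36, window=2))
```

The result is only a list of devices to try cutting power to first, not a conclusive identification.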

Once I have ensured the ghosted device has been disconnected from power I have found that repeatedly hitting "refresh" and "repair" does the trick. There are alternative methods that can be used if that fails, but they're a little more dramatic, so hopefully the simple solution will work for you.


That is correct. Hitting remove on the device list will put the hub in exclusion mode if the device was a z-wave device. Once it recognizes it is no longer in the z-wave database, it will remove it. However, if after a short timeout period, it sees no change, it will offer a Force Remove option that will delete the device but leave it in the Z-wave database. This could create a new ghost node…

Sometimes one needs to keep trying and sometimes it will eventually work. You may need to turn off some breakers…

Check your logs and see if there is an associated message. It can help diagnose the issue. Go ahead and post a screenshot if you can.

There are times where the ghost can not be removed by the hub and a USB z-wave stick must be used. If you find that you need to go that route, the write-up linked in the following post might be useful:

I am not familiar with the device in question, but as far as I can remember, I have only seen grayed out removal buttons on child devices. In those cases, the parent must be removed.

That said, removing the device from the device screen will not help with the removal of the ghost. It would be best to leave that one alone.


Thanks for the suggestions. This is still a work in progress. I have gone around to most of the circuit breakers, one or two at a time, turned them off and then went into the zwave screen and clicked refresh/remove a couple of times. No success.

I have not power cycled the Hubitat yet.

What I see in the logs are instances of the following message. 36 is the node in question.

sys:1 2021-09-12 09:23:52.484 am Failed node 36 remove status: 0 SDK failure node is no longer in failed node list

The Zwave status says node 36 is a SPECIFIC_TYPE_POWER_SWITCH_MULTILEVEL
So I have been focusing on the dimmer switch circuits and not things like Bond Hubs or outlets.
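If you end up with a lot of these entries, a short Python sketch like the one below can pull the node number and status out of each "Failed node" line so you can see which nodes keep failing removal. The message wording is taken from the log line quoted above; the function and class names are hypothetical, and the meaning of the status value is not something this sketch tries to interpret.

```python
import re
from typing import NamedTuple, Optional


class FailedRemove(NamedTuple):
    node_id: int
    status: int
    detail: str


# Matches the message text only, so any timestamp prefix in front is ignored.
PATTERN = re.compile(r"Failed node (\d+) remove status: (\d+)\s*(.*)")


def parse_failed_remove(line: str) -> Optional[FailedRemove]:
    """Return the parsed fields, or None if the line is some other log message."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return FailedRemove(int(match.group(1)), int(match.group(2)), match.group(3).strip())


line = "Failed node 36 remove status: 0 SDK failure node is no longer in failed node list"
print(parse_failed_remove(line))
```

Lines that do not match the pattern return `None`, so other log messages can be skipped without special handling.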

Rebooting (and removing power) would be a good idea. Start by shutting it down via settings, then pull power from the plug (pulling power from the hub is not recommended as it can weaken the connector on the board). Leave it unpowered for 30-60 seconds then restart it. This will properly reboot the z-wave radio.

@bcopeland, I can’t recall - does this message mean that it is still seeing the device?

Shutdown Hub / Pulled USB at power plug / Waited 60 secs / Reinserted Power / Waited for boot.

Z-wave Device looks the same but says OK and shows Discover button.
I hit Refresh and OK goes to pending and Discover button goes away.
Refresh and Remove buttons don't change anything and the same log message:

Failed node 36 remove status: 0 SDK failure node is no longer in failed node list

Also, having a log screen up alongside the Z-Wave screen shows that as soon as I hit refresh I get a log message:

Z-Wave Network responded with Busy message.

That may be an expected message I don't know. Then when I hit the Remove I get the not in failed node list.

I wish there was a way to get more detail on what it thinks the node is other than the device class.

Unfortunately that is a sign that the SDK has database corruption.. 2.2.9 is getting a Z-Wave SDK update that should prevent future corruptions from happening.


2.2.9 is getting a Z-Wave SDK update that should prevent future corruptions from happening.

I have no reason to believe my SDK database is corrupt. However:

  1. Is there any way to validate the SDK database?
  2. Any way to fix the SDK database if it is corrupted?
  3. Will it be possible to install the updated SDK if the SDK database is corrupt?
  4. Will the updated SDK fix a corrupted SDK database as a part of the installation process?

  1. No, it is a proprietary database format; we only really know when the conversion happens, and there is logging that confirms it.
  2. Unfortunately not yet. I’m hoping that with the new database format being standard, rather than proprietary, we may be able to do more.
  3. Yes.
  4. No. Nodes that are already corrupt in the database will still be corrupt in the new database, but it should prevent any new corruption.


Will corrupted nodes be identified? If so, will we be able to delete just those bad nodes and re-pair those devices, or will it be necessary to wipe everything and start over?

And, how does Hub Protect handle migration of a corrupted SDK database?


So I thought I would try to go with the Z-Stick approach to delete the zombie node. I have a Homeseer Z-Stick I used to flash upgraded firmware last year but it doesn't have all the other functions like deleting nodes that the SiLabs Simplicity Studio has. I went to their website and tried to create an account to download their studio software but it never sends me the verification email. Is there some other place to get it or better alternative Z-Stick software to use?

As I recall even once you get the software installed it asks you to register it with your user name and pw... so an alternative download link won't work.

Spam filter issue?

I have checked through my spam filters on my mac as well as the mail server and even tried a different email address as well but nada.

Not sure how to help you there. Maybe the issue is on the SiLabs side? However, here's a potential challenge for you. While Simplicity Studio itself runs on a Mac, the PC controller tool you need (which runs inside Simplicity Studio) does NOT run on a Mac. It does run on VMware Fusion, though.

Let me try a quick registration with an alternative email and see what happens.

I tried registering with a gmail account and the email came through almost immediately... ?

I have Parallels so I can run it under Windows if needed.

I have a PoBox email address but I also have a gmail account as well. I just tried that and checked the gmail spam filter as well and nothing. I suppose I can try the registration in Chrome vs Safari and see if that makes a difference.

I used Safari... very strange. I don't think it's related to using a Mac since I did (plus that would be just weird... it's email)

I run Fusion so that I know works. But I don't see why Parallels wouldn't work just fine as well.