[2.3.9.148] Help understanding how Z-wave Repair works. Help with Current z-wave issues. [C-8 Pro]

Even though I am running a Beta version, I am posting this here because I don't think this is beta related. If I should post this to Beta instead, please let me know.

After having some issues with a couple of Z-wave locks (this may be beta related), I spent the better part of the last 2 days troubleshooting, removing the locks, trying to eliminate ghost devices, and more. Ultimately, I messed things up enough that I had to restore the hub and Z-wave database to a point before I started. This did eventually fix the problem, and I got the ghosts removed, but it left me with a Z-wave mesh that had no recognized neighbors. None of the devices had known neighbors, and the topology was 100% red. I did numerous shutdowns, 30-second unplugs, and restart procedures. I waited a couple of hours to see if it would restore itself at all. I tried running a global Z-wave repair, and initially all nodes failed. Upon running it again, no nodes were found in Group 1; then in Group 2 it started running through all the nodes (almost) successfully. Things started looking better, but I still have a problem. After leaving the hub alone all night (12 hours), I get the topology below. While most devices can communicate with the hub, the hub can't communicate with many of the devices. You can see why in the following topology:

As you can see in Column 1 (the hub), most of the devices can see the hub, except the sleepy ones. However, in Row 1, the hub cannot see most of the devices.
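To make the row-versus-column asymmetry concrete, here is a minimal Python sketch. It assumes the topology has been copied by hand into a dictionary of neighbor sets; the `sees` data and the `one_way_links` helper are hypothetical and are not something the hub exports or provides. It only illustrates what "the devices see the hub, but the hub does not see them" means in terms of the table.

```python
# Hypothetical data: sees[a] is the set of nodes that node a lists as neighbors.
# Row 1 of the topology corresponds to sees[1]; column 1 corresponds to every
# node whose own neighbor set contains node 1.

def one_way_links(sees, hub=1):
    """Return nodes that list the hub as a neighbor while the hub does not list them."""
    hub_row = sees.get(hub, set())
    return sorted(
        node
        for node, neighbors in sees.items()
        if node != hub and hub in neighbors and node not in hub_row
    )

# Made-up example values, not taken from the actual mesh in this thread.
sees = {
    1: {6, 7},   # the hub only lists nodes 6 and 7
    6: {1, 7},
    7: {1, 6},
    8: {1, 6},   # lists the hub, but the hub does not list it back
    9: {6},      # e.g. a sleepy device that does not list the hub at all
}

print(one_way_links(sees))  # [8]
```

In a healthy mesh this list would normally be short; in the topology described above it would contain most of the mains-powered devices.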

My questions are:

  1. A general question about repair: What are the Groups (1-6) in repair? What is their purpose? How do groups get selected by the Hub? I have done nothing to set Groups in Z-wave and don't know how I might be able to do so. Before this issue, when I ran global repair, nearly all devices were shown in Group 1. Now none are; they are all in Group 2. What does this mean? Even more interesting, when I do individual device repairs, every one of the devices shows as being repaired in Group 1, not 2, but when doing a global repair they only appear in Group 2.

  2. How do I get the Hub (device 1) to start recognizing neighbors? I can't do a repair on device 1.

Let's start with these questions, but I may have more after these are answered. Can anyone help me understand repair and repair groups better?

LJ

The z-wave radio restore must wipe the routes, similar to a migration. I think it can take a few days for everything to get fully sorted out. I would just wait a little while longer.

How long might I expect that to take?

Also, are you able to answer my question about repair groups? What are they and what do they mean? Why would devices appear in Group 1 for individual repairs but in Group 2 for global repairs?

Not sure about the groups. I know there are extra "helper" nodes the hub creates, typically nodes 2-5. Possibly the groups have something to do with that? Maybe the hub is talking to the devices via helper node 2, which might be why it is not showing on the topology.

I know there was one other case in the past where the hub was showing like that on the topology but I do not remember what the resolution was.

Yes, I remember when nodes 2-5 were reserved for helper nodes, but that changed some time ago: the helper nodes don't show up as reserved anymore, and nodes 2-5 now show as devices.

Maybe @bobbyD or someone else can find an answer to the group questions and what they mean. I suspect that the Group 1 vs. Group 2 thing is why my hub cannot communicate with most of the devices and why the topology doesn't show row 1. I just don't get why individual device repairs show in Group 1 while global repairs show in Group 2.

LJ

??? They do?
I still have those virtual nodes on all my hubs, they just are hidden from the z-wave details.

We need to find the old thread, because they had the exact same problem: the hub was not talking to devices, but devices could send stuff to the hub. There was possibly a fix in there.

Also, maybe if you restore your cloud backup again it will put it back to normal? Be more patient this time and let it do its thing before you start trying to remove ghosts with PC Controller, etc.

I have devices on nodes 2 and 4. I have for a long time.

Should that be a concern? Should I remove and re-pair them? Those two seem to work fine.

LJ

If you migrated from a C5 once upon a time, then the new Z-wave chip would have created virtual nodes on different node IDs. The C5 and earlier did not have the helper nodes. You can see them in PC Controller listed as "Virtual Node".

I did migrate from C5 to C7 to C8 Pro over the years. Looking at PC Controller, I have 12 virtual nodes, grouped sequentially in separate groups of 4. Hmmm. What does that mean?

LJ

My main hub has two groups. Supposedly it means that, for whatever reason, the controller decided to make new ones at some point, possibly because of some sort of corruption on the original C7 Z-wave firmware, which had issues. I was told not to mess with them.

Good advice. I try not to mess with things I don't fully understand unless I have to. Thus my question about the repair Groups.

LJ

I found the old thread with the same issue with the topology.

It appears the cause was interference. Moving the hub fixed the issues, though I'm not sure if it fixed the topology. Also make sure your antennas are tight; you could also shut down the hub and try swapping them.

I've tried relocating the hub per the post you shared. It is now on a shelf immediately above its old position and about 6 ft higher. There is nothing electronic around it besides a couple of Hue bulbs (ZigBee) about 24 in. away. I also switched the antennas just for kicks. I am skeptical this will fix things, simply because even if these C8 Pros are more sensitive, I would expect more than 6 out of 66 devices (the nearby ones) to connect in row 1, but who can say for sure?

I did a shutdown, power down, wait, power up. I am not going to touch the D^#n thing for 48 hours. (That will be hard :face_with_spiral_eyes:) Let's see if it calms down. Things are working more reliably this morning, though there are still some delays and hiccups, just not as many. The topology has not changed. Do you think it would help for me to periodically operate the devices, both from the device and from the hub, during the 48-hour period to stimulate communication and possibly encourage finding new routes?

One question: Are you aware of any way to eliminate/erase only the neighbors/routing info in the mesh but not the devices, thereby forcing a complete new mesh build?

LJ

PS: Can you tell me what node is involved when the Z-wave Logs show dev: with no device number? Is that the hub? The virtual devices (helpers)?

Yes

I think a cloud backup and restore does this (isn't that how you got to this point?).

Usually it is from the first message it sees from a device (while logs are open). For some reason it does not match up correctly until the next one. A bug in the Z-wave logs, I guess.

I will certainly wait to see how the 48-hour abstinence period works. Thanks,

LJ

Ok, after 2+ days of abstinence, here is the current state of my Z-wave.

  1. The performance of the mesh is improved. I have no noticeable performance issues.

  2. That being said, the first row of the topology has only gained about 3 or 4 devices. All the hub messages are routed through 4-5 other nodes, even though there are closer line-of-sight devices. Perhaps row 1 will continue to build out over time.

  3. All orphaned devices and Z-wave nodes have been gone for two days.

  4. The last and most annoying issue is that I can't add back the problem lock that started all this in the first place. No matter how close to or far from the hub I try to pair it, I get the same result. When trying to pair, the lock reports that it paired successfully, and HE says it found a device and is initializing, but it never completes the initialization. After rebooting HE, I sometimes see a new Z-wave node in Z-wave settings, but it shows either an Allegiant (Schlage) device class or an AEON device class, or no device class manufacturer at all. It only sometimes creates a device with a generic name. Other times, no device is created at all. I once got it to pair with an AEON type and a generic device. I then manually changed the device driver to Generic Z-wave Lock and was able to get the lock to lock and unlock, though unreliably, but it couldn't take codes. The lock would not respond at all to the stock Schlage FE599 driver. After rebooting, all these erroneous pairings would exclude normally. Any ideas on this one? I am still searching. I have reset the lock dozens of times, so that is not the cause.

  5. One other question: in the Z-wave settings where you select the radio frequency, I have a choice of US or US-LR. I have no LR devices, so I tried to set that back to non-LR. It immediately goes back to LR. I know the frequencies are the same, but I have assumed there is different functionality somewhere since there are different settings. Is that true? Why are there two settings if only one is needed? If they are genuinely different, why can't I select non-LR and have it stay?

Thanks again

LJ

My guess here is that both options are necessary to accommodate the entire family of Hubitat hubs, since only the 8/8P can do LR.

Perhaps it's too dicey to build dynamic logic into what options are shown in that menu, but it would be helpful to perhaps add something like a "(8/8 Pro only)" blurb after the LR entry.

That makes total sense. Thanks for that insight.

LJ

OK, I think I have determined that the problem I am having with my Schlage FE599 lock is a C8/C8 Pro problem. It is likely a problem with the 800 chipset. I can pair the lock without issue on the C5 and my C7. It pairs right away and immediately picks up the built-in FE599 driver. On the 800, the hub simply doesn't recognize the lock's device manufacturer info. As I think about it, the flakiness started sometime after I migrated to the C8 Pro from the C7 in January, or after some subsequent update.

Might this be a beta issue that started with the LR support? A lot of strangeness started in Z-wave when 2.3.9 rolled out.

Who should I report this issue to? Bobby, Bryan, or Victor? I think I will start with @bobbyD for his advice.

LJ