HELP! All devices have fallen off the Zigbee mesh at once - Oz Hubitat user

I don't know how much extra would have to be onboard/in firmware for a future HE hub to better help diagnose these kinds of things... but I'd be all in for it.

Some kind of "getChildandRouteInfo parser/analyzer" app might be feasible to flag major symptoms like empty neighbor tables, majority of neighbors with low LQI or zero outCost, or too many neighbor changes in a given interval (that would require analyzing and comparing multiple snapshots of the page). Could provide a good starting point for troubleshooting purposes, though some 'big picture' knowledge would still be necessary to make use of it.
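As a rough illustration of what such an analyzer could look like, here is a minimal Python sketch. It assumes neighbor lines in the format the getChildAndRouteInfo page prints (for example `[Study Switch, 757B], LQI:254, age:3, inCost:1, outCost:0`). The function names and the LQI threshold of 200 are hypothetical choices for illustration, not anything provided by Hubitat.

```python
import re

# Matches lines like: [Study Switch, 757B], LQI:254, age:3, inCost:1, outCost:0
NEIGHBOR_RE = re.compile(
    r"\[(?P<name>.*), (?P<addr>[0-9A-Fa-f]{4})\], "
    r"LQI:(?P<lqi>\d+), age:(?P<age>\d+), "
    r"inCost:(?P<in_cost>\d+), outCost:(?P<out_cost>\d+)"
)


def parse_neighbors(page_text):
    """Return one dict per neighbor table line found in the page text."""
    neighbors = []
    for line in page_text.splitlines():
        m = NEIGHBOR_RE.search(line)
        if m:
            neighbors.append({
                "name": m["name"],
                "addr": m["addr"],
                "lqi": int(m["lqi"]),
                "age": int(m["age"]),
                "in_cost": int(m["in_cost"]),
                "out_cost": int(m["out_cost"]),
            })
    return neighbors


def flag_symptoms(neighbors, low_lqi=200):
    """Flag the major symptoms; low_lqi is an arbitrary, assumed threshold."""
    if not neighbors:
        return ["empty neighbor table"]
    flags = []
    total = len(neighbors)
    zero_out = sum(1 for nb in neighbors if nb["out_cost"] == 0)
    weak = sum(1 for nb in neighbors if nb["lqi"] < low_lqi)
    if zero_out * 2 > total:  # strict majority
        flags.append(f"{zero_out}/{total} neighbors report outCost 0")
    if weak * 2 > total:
        flags.append(f"{weak}/{total} neighbors have LQI below {low_lqi}")
    return flags


def neighbor_churn(previous, current):
    """Addresses that appear in only one of two snapshots."""
    return {nb["addr"] for nb in previous} ^ {nb["addr"] for nb in current}


sample = """\
Neighbor Table Entry
[Study Switch, 757B], LQI:254, age:3, inCost:1, outCost:0
[null, AB4D], LQI:253, age:4, inCost:3, outCost:0
"""
print(flag_symptoms(parse_neighbors(sample)))
```

Counting how often `neighbor_churn` returns a non-empty set across snapshots taken a few minutes apart would cover the "too many neighbor changes" case. Deciding what counts as too many still needs the big-picture judgment mentioned above.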

One other thing (a bit of a long shot): could you also try deleting the devices from HE and re-pairing them one at a time, allowing some time between pairings, to check that a particular device isn't at fault? It seems strange that the ones connected directly to HE would fall off too; these should remain connected regardless of whether the rest of the mesh dies, unless the radio/HE has developed a fault or there is a lot of interference. Your logs might give a clue to what's going on.

Another thing to try would be to turn off your channel 11 Wi-Fi and use Zigbee channel 25 (or select whichever channel has the least interference from your neighbours). Also, don't forget to turn off 40 MHz channel bonding if it's on.
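To see which Zigbee channels sit clear of a given Wi-Fi channel, a small calculation helps. The sketch below uses the standard 2.4 GHz channel center frequencies and assumes a nominal 20 MHz Wi-Fi channel and a 2 MHz Zigbee channel. It does not model 40 MHz bonding or the energy real Wi-Fi gear leaks outside its nominal band, so treat the result as a first guess. The function names are made up for illustration.

```python
WIFI_WIDTH_MHZ = 20   # nominal width, assumed; bonding is not modelled
ZIGBEE_WIDTH_MHZ = 2  # nominal width of a 2.4 GHz Zigbee channel


def zigbee_center_mhz(channel):
    """Center frequency of Zigbee (IEEE 802.15.4) channels 11-26."""
    if not 11 <= channel <= 26:
        raise ValueError("Zigbee 2.4 GHz channels run from 11 to 26")
    return 2405 + 5 * (channel - 11)


def wifi_center_mhz(channel):
    """Center frequency of 2.4 GHz Wi-Fi channels 1-13."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch covers Wi-Fi channels 1 to 13")
    return 2407 + 5 * channel


def overlaps(zigbee_channel, wifi_channel):
    """True if the two nominal bands overlap."""
    gap = abs(zigbee_center_mhz(zigbee_channel) - wifi_center_mhz(wifi_channel))
    return gap < (WIFI_WIDTH_MHZ + ZIGBEE_WIDTH_MHZ) / 2


def clear_zigbee_channels(wifi_channel):
    """Zigbee channels whose nominal band does not overlap the Wi-Fi channel."""
    return [ch for ch in range(11, 27) if not overlaps(ch, wifi_channel)]


print(clear_zigbee_channels(11))
```

By this estimate Zigbee channel 25 (2475 MHz) is clear of Wi-Fi channel 11 (2462 MHz), while Zigbee channels 21 to 24 sit inside it.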

Would be nice to have these tools like network mapping built in. It would eliminate the need to fire up the xbee and XCTU!

Thanks @Tony for all your input. This is really helpful; I never knew this bit. Please see below. I have gone through getChildAndRouteInfo. I added a couple of switches; they initially had outCost 1, but now they do not work and, as suggested, have outCost 0.

Parent child parameters
EzspGetParentChildParametersResponse [childCount=1, parentEui64=0000000000000000, parentNodeId=65535]

Child Data
child:[Pantry Motion Sensor, EEE1, type:EMBER_SLEEPY_END_DEVICE]

Neighbor Table Entry
[Upstairs Common Area, 05B7], LQI:255, age:3, inCost:1, outCost:0
[Backyard Main Switch, 48D8], LQI:255, age:4, inCost:1, outCost:0
[Study Switch, 757B], LQI:254, age:3, inCost:1, outCost:0
[Formal Switch, 87A7], LQI:255, age:3, inCost:1, outCost:0
[null, AB4D], LQI:253, age:4, inCost:3, outCost:0
[Family Area Switch, ADA7], LQI:254, age:4, inCost:1, outCost:0
[Upstairs Bathroom, FEFF], LQI:255, age:4, inCost:1, outCost:0

Route Table Entry
status:In Discovery, age:0, routeRecordState:0, concentratorType:None, [Upstairs Bathroom, FEFF] via [null, 0000]
status:Unused
status:Unused
status:Unused
status:Unused
status:Unused
status:Unused
status:Unused
status:Unused
status:Unused
status:Unused
status:Unused
status:Unused
status:Unused
status:Unused
status:Unused

The route table does have some entries that appear and disappear, as if devices are trying to make a connection with the hub but cannot keep a consistent one.

LQI seems to be good and above 200 for all devices.

Thanks for your reply, I did try to turn off all my Access points & pair devices & see if it stayed connected, but unfortunately that did not work.

I have very limited access to my access point (TENDA NOVA MW3), which is managed only through the app, so I cannot select which channel it runs on or change the frequency.

I did a bit more troubleshooting. Here is what I did:

  1. Removed all devices.
  2. Added just one IKEA Tradfri RGB bulb.
  3. Added a 1 gang switch. The IKEA globe and the switch co-existed for quite some time without any issues.
  4. Introduced my bedroom 2 gang switch. As soon as I added this second switch, the first 1 gang switch stopped working and the outCost for both the 1 gang and 2 gang switches changed from 1 to 0.
  5. Throughout all this, the IKEA RGB globe stayed connected and had no issues or conflicts.
  6. Removed both switches again and re-added the 2 gang switch, and it stayed connected.
  7. After 15 minutes I added the 1 gang switch and it broke all the switches again.

I'm running out of ideas at this point.

@mike.maxwell can you check his engineering logs? This is really strange

Try restarting all your Wi-Fi devices. Strange as it sounds, I have a few Wi-Fi cameras that have completely shut down my Zigbee network. Once they were restarted, my Zigbee devices all came back.

The reception part (LQIs, inCost figures) shows that the hub appears to be receiving OK; it's the transmission part that seems to be an issue. outCosts get reported as 0 after a number of 15 sec. reporting intervals have elapsed without remote devices sending a valid link status report (which should contain their assessment of reception quality at the remote end of the link).

For a viable link, instead of outCost=0 you want to see a number in the 1-7 range, derived from that remote device's measure of its own LQI. Zigbee's routing strategy then attempts to choose the lowest cost path when routing messages.
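To make the cost idea concrete, here is a toy Python sketch. It treats a link as viable only when both inCost and outCost are in the 1-7 range, and it compares candidate paths by the sum of their per-hop costs. Real Zigbee route discovery is distributed across the routers; the function names and the `paths` structure here are hypothetical and only illustrate the comparison.

```python
def link_is_viable(in_cost, out_cost):
    """A link is usable only if both directions report a cost of 1-7.

    A cost of 0 is treated as "no usable report" (for outCost this is the
    missing link status case described above).
    """
    return 1 <= in_cost <= 7 and 1 <= out_cost <= 7


def cheapest_path(paths):
    """Return the label of the viable path with the lowest total cost.

    `paths` maps a label to a list of per-hop link costs (hypothetical
    structure). Paths containing a hop outside 1-7 are skipped.
    """
    viable = {
        label: hops
        for label, hops in paths.items()
        if hops and all(1 <= cost <= 7 for cost in hops)
    }
    if not viable:
        return None
    return min(viable, key=lambda label: sum(viable[label]))


# A neighbor with inCost 1 but outCost 0, like the switches in the first table.
print(link_is_viable(1, 0))
# A poor direct link (cost 7) against two good hops (1 + 3 = 4).
print(cheapest_path({"direct": [7], "via Study Switch": [1, 3]}))
```

This is why a table full of outCost 0 entries matters: with no viable link in the outbound direction, there is no path for the hub to pick.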

Looks like some device had an issue pairing at some point (the null entry in the neighbor table might be a remnant from when it initially joined; that won't cause any issues and will get purged out eventually).

Not sure if this is feasible for you but I'd be curious to see if there is any subset of routing devices that would result in a stable mesh... turn all of them off (not necessary to remove them, just power them off) and turn on one... see if its entry in the neighbor table looks viable. After a while, turn on another, etc. Just in case a single bad device could be causing an issue. Otherwise, I'd have to suspect some transmission issue on the hub side (just going by all the zero outCost figures).

That part is normal; route table entries 'age out' when the path hasn't been in recent use and what is shown there is basically a 'cache' of route discovery requests that the hub makes when it wants to establish a path to a device. Those aren't meant to be static.

Good idea... also make sure the hub isn't very close to your wifi router. Regardless of channel, the sidelobe interference can cause problems if the hub and wifi router are closer than a meter or so. I have seen this phenomenon first hand with a buddy's SmartThings setup (even though my own arrangement is pretty close to violating this advice).

This seems to be a problem with the Nue switches. I have tried adding an Ikea bulb and a Sonoff motion sensor afterwards, and they both still work. I just now added a 4 gang switch; it stayed on for only a minute and then turned off.

Parent child parameters
EzspGetParentChildParametersResponse [childCount=0, parentEui64=0000000000000000, parentNodeId=65535]

Child Data

Neighbor Table Entry
[Bedroom Lamp, 649F], LQI:254, age:3, inCost:1, outCost:1
[null, 757B], LQI:255, age:3, inCost:1, outCost:0
[Family Switch, 8C8B], LQI:255, age:4, inCost:1, outCost:0
[null, F0DA], LQI:255, age:3, inCost:1, outCost:0

I have made sure that the hub is at least 1 meter away from my Access point. The Hub was working in its current place for about a year.

I have tried turning off all Wi-fi access points & still no luck.

All the other Nue switches are hard-wired to the wall. The best I can do is turn off the mains for, say, 2-5 minutes and turn it back on to see if that resets all of them.

Ah... someone else had issues with their Nue devices a while back. Check out this thread: Nue Zigbee Dimmer falling off the mesh, errors in the log - Get Help / Devices - Hubitat

Not sure if this problem was ever resolved, unfortunately.

Thanks for sharing this, but doesn't look like they were able to get anywhere with the dimmer issue. I have called the 3A guys, going in this arvo with the I.C. He suggested that a firmware update may be able to fix this.

I will keep you guys posted.

Had to Google that one. I haven't been to Oz since 1991, have forgotten lots.. lol.

Ok, I have all my switches working Finally!!!

  1. Took the switches to 3A Smart home.
  2. The guy plugged in all the switches & flashed them with a device he had with him.

I got them back hoping they paired/stayed connected. I put them all back on & paired them. They all paired successfully & now stay connected.

Neighbor Table entry looks a lot better now!!!

Neighbor Table Entry
[Upstairs Common Area Switch, 0AAD], LQI:254, age:3, inCost:1, outCost:1
[Family Switch, 0D89], LQI:254, age:3, inCost:1, outCost:1
[Upstairs Bathroom Switch, 227C], LQI:255, age:3, inCost:1, outCost:1
[Formal Switch, 2AD0], LQI:255, age:3, inCost:1, outCost:1
[Bedroom Switch, 2DB4], LQI:254, age:4, inCost:1, outCost:1
[Study Switch, 634A], LQI:255, age:4, inCost:1, outCost:3
[Bedroom Lamp, 649F], LQI:255, age:3, inCost:1, outCost:1
[null, 757B], LQI:255, age:3, inCost:1, outCost:1
[Kids Bedroom Switch, 7D3A], LQI:255, age:3, inCost:1, outCost:1
[Backyard Main Switch, ABB3], LQI:255, age:3, inCost:1, outCost:3
[Pantry Switch, D3F1], LQI:255, age:3, inCost:1, outCost:3
[Kids Bedroom lamp, F6F5], LQI:255, age:4, inCost:1, outCost:1

I thank one n all who provided all their knowledge and assisted me in troubleshooting / resolving.

I think this post should be pinned somewhere so all the people having issues with Nue Switch can see this.

Thanks

Hmm, you should ask if they can make the firmware file available to you. @mike.maxwell could drop it on hubitat

If an incompatible device won't stay connected to the mesh indefinitely, not likely it will be able to stay connected long enough to do an OTA firmware upgrade. Definitely merits a heads up on the 3A or nue websites, however.

I did suggest that to them, and he said it's not yet compatible with Hubitat. Not sure what he meant by that. I did tell him that this way of managing it is not feasible. Imagine someone in Sydney or elsewhere having to remove the switches, put the dumb switches back on, and post the Nue ones in to be fixed.

I’m experiencing the same issues and reached out to 3a and got the following response:

The dimmer switch does not have the issue that requires to upgrade the firmware like the 1, 2,3 and 4 gang switches. I will send a sample to Hubitat- Mike to test it with the dimmer driver next week.

That was in September though so I’m not holding my breath. I’ve followed up but am not receiving any response.