Zigbee Network Issues

I'm not sure how often the route table info is updated for display purposes in the log, but for the Ember Zigbee stack, the 'age' count represents the number of 16-second intervals that have elapsed since the last link status message was received. A Zigbee router usually sends a link status message every 16 seconds. When a neighbor table entry's age count becomes greater than 6, the entry is considered 'stale': it becomes eligible to be 'evicted' from the table, and its 'cost' gets set to zero, indicating 'unknown'.
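To make the aging rule concrete, here is a minimal sketch of it in Python. The function and constant names are made up for illustration and are not part of any Ember or Hubitat API; only the 16-second interval, the "greater than 6" threshold, and the zero cost come from the description above.

```python
# Illustrative only: names are hypothetical, numbers are from the post above.
LINK_STATUS_INTERVAL_S = 16  # a router usually sends link status every 16 s
STALE_AGE_THRESHOLD = 6      # an age count greater than this marks the entry stale


def neighbor_age(seconds_since_last_link_status: float) -> int:
    """Number of whole 16-second intervals since the last link status message."""
    return int(seconds_since_last_link_status // LINK_STATUS_INTERVAL_S)


def is_stale(age: int) -> bool:
    """A stale entry is eligible for eviction from the neighbor table."""
    return age > STALE_AGE_THRESHOLD


def effective_cost(age: int, cost: int) -> int:
    """Stale entries report a cost of 0, meaning 'unknown'."""
    return 0 if is_stale(age) else cost
```

So a neighbor that has been silent for 112 seconds or more (seven full intervals) would show up as stale with a cost of 0.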

The end devices choose their 'best' parent router based on their own assessment of the link, so the chosen parent may not be the physically closest router. You don't normally have a way to control this process.

Wow amazing explanation..... I got it Tony thanks!!

So there is no signal strength report here, right?
... Is there any other screen where I can see where it would be better to physically place a device?

The LQI figure can indicate signal strength/integrity of the link as a whole (in some implementations it takes into account the link error rate); numbers approaching 255 are better (basically if it is >200 it should be acceptable).

If you want to see the 'last hop' received signal strength metric (RSSI) from a device, go to Settings > Zigbee Details > Zigbee Logging. Keep in mind that if an end device is not a child of the hub, you will be seeing the RSSI of the transmitter of the last router in the path, not the end device's; that's what 'last hop' means. Also note that RSSI is a 'raw' measure of the RF energy and can include other networks in the same frequency band, as well as noise.

The more negative the RSSI number, the lower the measured signal strength. Most of my devices very close to the hub show RSSIs in the -40s / -50s or so; I don't have any apparent issues even when they are in the -80s, although I'm sure some retries may be happening. When the RSSI gets to the -90s, though, the signal is getting down to the noise level.
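As a rough way to apply these rules of thumb to numbers read off the logging page, here is a small Python sketch. The function names and label strings are invented for illustration. The LQI cutoff of 200 and the RSSI bands for the -40s/-50s, -80s, and -90s come from the posts above; the "usable" label for the -60s/-70s is an assumption that fills the gap between them.

```python
def describe_lqi(lqi: int) -> str:
    """Classify an LQI value (0-255); above 200 is the 'acceptable' rule of thumb."""
    if not 0 <= lqi <= 255:
        raise ValueError("LQI must be between 0 and 255")
    return "acceptable" if lqi > 200 else "below the >200 rule of thumb"


def describe_rssi(rssi_dbm: int) -> str:
    """Classify a last-hop RSSI reading in dBm (more negative means weaker)."""
    if rssi_dbm > -60:
        return "strong"                 # -40s / -50s: typical close to the hub
    if rssi_dbm > -80:
        return "usable"                 # assumption: band not described in the post
    if rssi_dbm > -90:
        return "weak, retries likely"   # -80s
    return "near the noise level"       # -90s and below
```

For example, `describe_rssi(-85)` returns `"weak, retries likely"`. Remember that RSSI is raw RF energy, so a "strong" reading can still include interference from other networks.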

Hi guys
Today I faced another issue here.... My wife woke up first and she told me everything was OK with my HE (lights turning on automatically, etc...), but when I woke up nothing was working and I couldn't log in to HE, as the web interface never showed up. I checked and the hub was connected to my intranet (ping was OK).

So I checked Past Logs and I found this:

dev133 is a Generic Zigbee Motion Sensor in my kitchen (a Samsung motion sensor)

I never saw that error before, and after that nothing was working.... I had to remove the power manually and plug it in again. Now all is OK.

What was that?
Why did it happen?
I don't want this to happen when I'm away from home.... so how can I avoid it?

Did you ever get an answer to this? I'm also curious as my peanut plugs report this.

A low ram concentrator is just what it sounds like. The coordinator device does not have enough memory to hold a full routing table, so it only stores a partial table. When the coordinator needs to send a message to a device but does not know the route, it first has to send out a route discovery frame and wait for a routing response before it can actually send the message.

A high-ram concentrator is the opposite. It maintains a full routing table with a route to each end device. This also means that a high-ram concentrator has to frequently request route information from devices that act as routers so it can keep its routing tables updated. Since the coordinator knows the routes to each device, whenever it needs to send a message it can do so without having to wait for route discovery. This method usually results in much quicker response times to devices.
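The difference can be shown with a toy model in Python. This is only a sketch of the idea, not how EmberZNet implements it: the `Concentrator` class, the `max_routes` parameter, and the oldest-first eviction are all assumptions made for illustration, and the periodic route refresh a high-ram concentrator performs is not modeled.

```python
from typing import Callable, Dict, List, Optional


class Concentrator:
    """Toy route cache: max_routes=None models 'high RAM', a small int 'low RAM'."""

    def __init__(self, max_routes: Optional[int] = None):
        if max_routes is not None and max_routes < 1:
            raise ValueError("max_routes must be at least 1, or None for unlimited")
        self.max_routes = max_routes
        self.routes: Dict[str, List[str]] = {}
        self.discoveries = 0  # counts route discoveries (broadcast-like traffic)

    def _remember(self, dest: str, path: List[str]) -> None:
        # A full low-RAM table drops its oldest entry (hypothetical policy).
        if (self.max_routes is not None and dest not in self.routes
                and len(self.routes) >= self.max_routes):
            del self.routes[next(iter(self.routes))]
        self.routes[dest] = path

    def send(self, dest: str, discover: Callable[[str], List[str]]) -> List[str]:
        """Return the path used; run discovery first if the route is unknown."""
        path = self.routes.get(dest)
        if path is None:
            self.discoveries += 1
            path = discover(dest)
            self._remember(dest, path)
        return path
```

With a one-entry table, alternating messages to two devices triggers a discovery on every send, while the unlimited table only discovers each route once. That extra discovery traffic is the cost being described.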

So, from this I take it that "concentratorType:None" routers would always have to send a discovery frame before sending a message? If a "low ram" device is constantly changing routes on the routing table as in this thread, will that negatively impact the coordinator, since it has to keep requesting routes? I only ask because others, as well as myself, have noticed this happening with the Peanut plugs, and since rejoining them to my "lights" hub I have seen a consistent increase in latency using Hub Watchdog. Maybe @mike.maxwell could take a stab at what is going on there.

Interesting. Glad @srwhite threw some sunlight on this particular bit of shady area in my Zigbee knowledge!

@Ken_Fraleigh that's an interesting question!

I wonder. I think I'm presently only using one Peanut, but it might be worth a remove to see what happens...

Scott

For one thing, in EmberZNet there is no such thing. A concentrator is either low ram or high ram. None isn't an option. :slight_smile:

But low ram concentrators have to frequently send route discovery requests prior to sending a message. That's horrible for large networks because these are effectively broadcast messages which can start to degrade a network.

The concept of high/low ram only applies to concentrators. A concentrator (or coordinator) is the hub in the Hubitat world. Whether a coordinator is high ram (full route table) or low ram (partial route table), route discovery can still happen since Zigbee is a self-updating mesh protocol.

Hubs like ST and HE use MTORR (many-to-one-routing) where the only thing a device talks to is the hub, and vice versa... Devices do not exchange messages with other devices. This requires a high-ram concentrator in order to avoid the overhead of route discovery.

What's probably happening, and one of the reasons why many bulbs suck as routers, is that they do not have a lot of RAM and processing power. They are prone to buffer overruns and corrupted messages, and who knows, they could be blasting out a ton of route discovery messages (which routing devices can do), causing the network to remain in a constant state of flux.

I would remove the bulbs and see if the peanuts stabilize, then try the same in reverse by adding the bulbs back and taking the peanuts out. It won't isolate a bad device, but will potentially help narrow it down to a class of devices.

I appreciate you taking the time to explain that so well, but I have one question. When it says

[quote="Ken_Fraleigh, post:103, topic:15240"]
concentratorType:None
[/quote]
what does that mean, if it isn’t a thing? I see that on bulbs, in-wall dimmers, and some plugs.

I would love to see, but this isn’t going to happen. My wife and kids would not be okay with hanging out in the dark. 60 bulbs are on the zigbee coordinator which only leaves the bedrooms, kitchen, and outdoors which have Hue, Hue, and zigbee switches respectively. I will just pitch the Peanuts into the drawer until next Christmas.

The difference is one is an API parameter, the other is a specification on a box... When you see concentrator type as being none on a spec sheet, it means that device is incapable of acting as a network coordinator (i.e. hub). It has nothing to do with the size and type of a routing table.

Hard to argue with that... Of course you can always give her your credit card and have her take the kids shopping!

:rofl::joy::rofl:
But seriously, I appreciate you taking the time to explain. I have wondered what the heck that meant since I discovered the getchildandrouteinfo page.
EDIT: Peanuts have gone to the drawer btw.

But I have no bulbs on my mesh except for Sengleds, which are end devices. And I had one peanut plug that was jumping around from router to router like crazy. So, is that a problem or not?

Then why are the peanut plugs listed as low ram concentrators and why do they keep jumping from router to router?

Also, the reason this whole topic got brought up in the first place is that I have a repeater, an Iris plug, which I paired in place. It is showing up in the neighbor table of the hub with an LQI less than 100. Now, there are 8 other repeaters in my network, 3 of which sit between the hub and this plug. One has direct line-of-sight to this plug, and that plug has an LQI of 253. So, my question is, why is this outlet still trying to connect directly to the hub rather than go through another repeater? An LQI of less than 100 is a very bad thing, right? So, why is it still trying to do that? Shouldn't it try to do a 2-hop route to the hub instead of trying to go directly to it? I mean, isn't that what a "mesh" network is supposed to be able to do?

I should clarify, since I was speaking earlier in terms of the hub and its relation to the output of the getChildAndRoute report. In a general sense, a concentrator is a routing-capable device, which the peanut plugs are (so are bulbs). That means those devices keep a partial routing table, which they have to do in order to advertise themselves as routers to other devices.

FWIW.. I have never seen any device other than the coordinator be a high ram (i.e. full routing table) concentrator.

Why your devices are link hopping is anyone's guess. The one thing I could suggest looking at is 2.4GHz interference from 802.11 devices. If you have an XBee available, you can use it as a spectrum analyzer. Plug it in to a laptop and move it around the house to see if there might be any RF hot spots.

XBee (Pro and non-Pro) are High Ram concentrators:

Those are the only devices I've seen that are High Ram concentrators, though.

That would make sense, since you can use an XBee as a coordinator.

Just because your low-LQI Iris plug is appearing in the Neighbor Table, it does not mean that the hub is communicating with it directly (though the mere fact that it is in the Neighbor Table means that it is, or has been, at least marginally capable of doing so). The Neighbor Table is not the same thing as the 'routing table' and does not show the route records (complete path of intervening routers) that the hub is actually using to communicate with this device. Also note that the Route Table Entry listing is not the complete routing table either, but just the next hop recently used to reach a device. When many-to-one source routing is being used, the route records stored by the hub will contain a reverse source route (sent along with the actual outbound Zigbee message to each successive next hop); this does specify the actual path being used. This path will have been determined as the 'least cost'/best path via LQI and link cost exchanges that have occurred during prior route discovery in response to a route request network command sent by the hub. The hub will store these route records for each device in the network in the routing table. Again, the route records are not visible through the ChildandRouteInfo page. However, if the device is indeed going through an intervening router, its next hop will at some point show up in a Route Table Entry, shown reachable as a 'via' through the intermediate router. This would be the indication that your problematic router is reachable by a router with a lower-cost/better path to the hub.

For example in my network the Neighbor Table is full, with 16 routers listed. Two of them, Iris Plugs H3 and H8, both appear in the Neighbor Table, with LQI 255 and 254 respectively. However I can see that a Route Table entry for H8 appears as follows:

[Living Room Iris Smart Plug H8, 152F] via [Living Room Xbox Iris Smart Plug H3, DD56]

So I know that even though both appear in the Neighbor Table (and each with good LQI), messages from the hub to plug H8 are being routed using H3 as the next hop as part of a two-hop route. Why this less efficient two-hop path is being used when the hub is evidently capable of communicating with H8 directly is another subject. I'll chalk it up to the dynamic nature of the network, imperfect optimizations of the link path cost, or whatever. It likely makes no perceptible difference in latency, so I am not stressing about it (H3 is an Iris plug literally two feet from the hub; H8 is diagonally across the room, about 25 feet further away).
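If you want to pick these 'via' relationships out of a long listing, a Route Table Entry line in the format shown above can be parsed with a short Python sketch like this one. The `RouteEntry` type and `parse_route_entry` function are hypothetical names, and the pattern assumes every entry looks exactly like the example (a name and a four-digit hex address in each bracket); other firmware versions may format the page differently.

```python
import re
from typing import NamedTuple, Optional


class RouteEntry(NamedTuple):
    dest_name: str
    dest_addr: str
    via_name: str
    via_addr: str


# Matches "[<name>, <4 hex digits>] via [<name>, <4 hex digits>]"
_ENTRY = re.compile(
    r"\[(?P<dest_name>.+?),\s*(?P<dest_addr>[0-9A-Fa-f]{4})\]"
    r"\s+via\s+"
    r"\[(?P<via_name>.+?),\s*(?P<via_addr>[0-9A-Fa-f]{4})\]"
)


def parse_route_entry(line: str) -> Optional[RouteEntry]:
    """Return the destination and next hop from one entry, or None if no match."""
    m = _ENTRY.search(line)
    if m is None:
        return None
    return RouteEntry(m["dest_name"], m["dest_addr"].upper(),
                      m["via_name"], m["via_addr"].upper())
```

Run on the H8 line above, this gives a destination address of `152F` and a next hop of `DD56`. Keep in mind that it only tells you the next hop, not the full route record.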

Then why do the Iris plugs list themselves as concentrator type: none? That doesn't make a whole lot of sense.

And 802.11 interference wouldn't make any sense either. That would just make the case for this plug, which is further away from the hub, to use one of the available repeaters instead of trying to connect directly to the hub. If I was expecting a direct hub connection and it was routing through another repeater, then interference could be an explanation. But how could interference result in a device not picking the better route to the hub but trying for the "more difficult" one?

Okay...that I did not understand. I thought that since it is in the Neighbor table that this was the way it was communicating with the hub.

What really ticks me off is that for some reason, the plug that had the terribly low LQI is now showing an LQI of 250...so it obviously doesn't need a repeater to get to the hub. :man_shrugging: Sometimes you just can't win. lol

I do appreciate the explanation. It makes a lot more sense now.

Ryan,

Could there be something else other than wifi causing RF noise affecting the plug with intermittently low LQI values? Microwaves, Bluetooth, motors, fluorescent lights, cordless phones, etc. all can jam up the frequencies used.

Not long ago I went to a customer site to troubleshoot a poor wifi signal and found the entire spectrum was flooded with RF noise. The source ended up being a cell tower several hundred feet away.