So, I've long suspected there was something wrong with either this device or a repeater between these devices. I bought a cheap sniffer to scout out what was going on, but it was so cheap that it led me to the wrong conclusion.
Next I built a zniffer (now that a lot of the silabs.com tools seem to be available without an NDA) and it told a different story. I got this capture:
What you are looking at (I think, since I haven't been using the Zniffer software very long) is a BinaryGet from the hub to the outlet (node 37). It has to hop through nodes 42 and 9 to get there. Once it arrives, the outlet sends an Ack back through each of the routing devices, and then the hub sends an Ack back to the last routing device. That must be how the protocol handles Acks: after the Ack comes back through, there is almost a thank-you from the hub on the last leg, back to the last hop.
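To make that sequence easier to compare against a capture, here is a minimal Python sketch that lists the legs I would expect for one routed frame. It only models my reading of the capture above, not the Z-Wave specification, and the hub's node ID (1) and the function name are assumptions of mine.

```python
def expected_hops(source, repeaters, destination):
    """List the (sender, receiver, kind) legs for one routed frame.

    This follows the pattern read off the capture (an assumption,
    not something taken from the Z-Wave specification).
    """
    path = [source, *repeaters, destination]
    legs = []
    # Outbound: the command travels hop by hop to the destination.
    for sender, receiver in zip(path, path[1:]):
        legs.append((sender, receiver, "command"))
    # Return: the destination's Ack retraces the same route in reverse.
    back = path[::-1]
    for sender, receiver in zip(back, back[1:]):
        legs.append((sender, receiver, "routed_ack"))
    # Final leg: the source acknowledges the last repeater it heard from.
    if repeaters:
        legs.append((source, repeaters[0], "ack"))
    return legs


HUB = 1  # hypothetical hub node ID
legs = expected_hops(HUB, [42, 9], 37)
for sender, receiver, kind in legs:
    print(f"{sender:>3} -> {receiver:<3} {kind}")
```

For the BinaryGet above this gives three command legs, three routed Ack legs, and one final Ack from the hub to node 42. Anything in the capture beyond that for the same frame is worth a closer look.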
Anyway, that all looks fine.
What doesn't look fine is that the outlet (again, node 37) turns around and initiates not one but two binary reports back to the hub, even though there were no issues with the first one. So the log reports twice that the switch is on.
This was a small but super clean example of what I sometimes see on a large scale with all of my outlets of this model and firmware.
Sometimes, instead of a single duplicate report, they send multiple duplicates, sometimes to the wrong devices and sometimes with a completely different command. When this happens everything starts talking, and sometimes I have to bring the hub down for all of the devices to settle down.
When it happens it appears that the outlet doesn't keep talking during the entire collapse. It just instigates and then watches society collapse.
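If anyone wants to check their own captures for the same thing, this is a rough Python sketch of how I would flag duplicates. It assumes the capture has already been reduced to one entry per end-to-end frame (a routed frame shows up once per hop in a raw capture). The `Frame` fields, the 2-second window, and the example values are all made up for illustration and are not the Zniffer export format.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    time: float        # seconds since the start of the capture
    source: int        # originating node ID (not the repeater)
    destination: int   # final destination node ID
    command: str       # e.g. "BinaryGet", "BinaryReport"
    payload: bytes = b""


def find_duplicates(frames, window=2.0):
    """Return (previous, repeat) pairs where a node sends the same command
    with the same payload to the same destination within `window` seconds."""
    last_seen = {}
    duplicates = []
    for frame in sorted(frames, key=lambda f: f.time):
        key = (frame.source, frame.destination, frame.command, frame.payload)
        previous = last_seen.get(key)
        if previous is not None and frame.time - previous.time <= window:
            duplicates.append((previous, frame))
        last_seen[key] = frame
    return duplicates


# Hypothetical capture: hub (node 1) polls outlet 37, which reports twice.
capture = [
    Frame(0.0, 1, 37, "BinaryGet"),
    Frame(0.3, 37, 1, "BinaryReport", b"\xff"),
    Frame(0.6, 37, 1, "BinaryReport", b"\xff"),
]
duplicates = find_duplicates(capture)
for first, repeat in duplicates:
    print(f"node {repeat.source} repeated {repeat.command} "
          f"after {repeat.time - first.time:.1f}s")
```

A hit here is not proof of a fault by itself, since a retransmission after a lost Ack would look the same. The interesting cases are the ones where the capture shows the first report was acknowledged.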
RSSI is 50 to 69. I don't think that is the RSSI the devices see between each other; it is what my zniffer module sees while snooping the packets. Either way, it feels high enough, because I'm able to see everything sent without corruption. First question: is there a way to ask a device for the RSSI of its neighbors?
Second question: is there any reason in the protocol for the outlet to do this? It got a successful Ack for the previous binary report. I can't see any reason for it to send another one.
If it is part of the protocol, I want to point out that I don't see it send the second report every time. It would also mean the hub isn't handling it correctly and shouldn't have logged the second report. If it ISN'T part of the protocol, I don't think we can blame the mesh in this case, because we see the Ack messages being sent and received correctly. Do we blame firmware? Are any other users with duplicate messages in their logs using these GE outlets?