Thoughts on bandwidth usage

Many of us have probably seen posts from time to time about overloaded Z-Wave meshes and the need to limit power reporting, etc. I was interested in exactly what this means, so I thought a (simplified) analysis of how many packets can be sent in a Z-Wave network might be interesting.

  1. I have a Z-Wave network with 100 nodes. Looking at the bandwidth and hop count, I found an average hop distance of 1.78 hops and an average bandwidth of about 80 kbps.
  • On the hop count, more specifically, 47 nodes had a direct (1 hop) connection to the hub, 51 had 2 hops, and 20 had 3.

  • Interestingly, a large number of the "multi-hop" devices and "lower speed" connected devices were battery operated - for example, 10 out of 12 of my HomeSeer leak sensors were multi-hop as were 4 out of 5 door locks.

  2. Now, Z-Wave is a CSMA/CA type network (Carrier Sense Multiple Access with Collision Avoidance), like WiFi. The CSMA/CA algorithm tries to sense when other nodes are transmitting to avoid problems, but even so, you can't ever achieve "full" efficiency of the network due to packet loss, collisions, etc. There are a lot of papers out there modeling the efficiency of CSMA/CA networks (considering collisions, packet loss, retransmission, etc.). I didn't find one specific to Z-Wave, but a rough guide is that you can get an efficiency of about 50% as the network gets larger. A Hubitat hub should be able to transmit much more efficiently - it knows when it is already sending another packet, so it should not cause self-collisions; the collisions would more likely be on the acknowledgements and hops. So given that 50% channel efficiency, the average channel bandwidth of 80 kbps results in more like 40 kbps that can be put to use. Drops fast!

  3. But wait, modern Z-Wave chips communicate on multiple channels simultaneously - I think there are 2 channels in the U.S. (actually, it might be 3, but I'm looking at the worst case of 2), so different nodes can use different channels and can round-robin between them -- big advantage - and we're now at 2 channels at 40 kbps - i.e., back to the 80 kbps total. Basically, the multiple channels counter the losses, collisions, etc. - how convenient. Update: I checked my Z-Wave logs and 3 channels are used in the U.S., so the performance could be better than I'm estimating here.

  4. So, divide by 8 to get capacity in bytes, and that's about 10 kBytes/s as the expected throughput on a Z-Wave network using 2 channels after accounting for collisions, loss, retransmission, etc.

  5. Now, how many packets per second can be transmitted with that 10 kBytes/s? Well, Z-Wave uses the G.9959 physical layer, and each packet starts with a preamble of 24 bytes -- used to synchronize receivers and transmitters and for the CSMA/CA algorithm to detect that a transmission is about to start. Add to that a start-of-frame and end-of-frame, and you've got about 26 bytes of air time before you even transmit anything useful.

  • Then start framing your data - add in the HomeID (4 bytes), source ID (1 byte), Header (1 byte), routing info (up to 4 bytes), and data length (1 byte), and you have another 11 bytes -- so about 37 bytes of air time just to get started.

  • Now the rest is relatively small - a simple command like an on-off for a device is only about 2 bytes, while a complex report -- such as a power or sensor report -- can be more (about 15 bytes for a Meter Report using 4-byte meter values).

  6. So, look at the worst-case packet size -- i.e., for a meter report -- and you get about 52 bytes of air time used up between the preamble, basic framing, and payload. But let's double that - every time you send, you also get an Ack coming back the other way. Ack frames are much smaller, so let's say that sending a packet and receiving its Ack uses up about 100 bytes of air time in total.

  7. But wait, my average hop count was 1.8, so each time something gets sent, multiply by 1.8 since that's the average number of times it is repeated - thus, about 180 bytes of air time get used for a transmission (again, leaning toward the worst case of larger packets, etc.)

  8. Taking the 10 kBytes/s of throughput found earlier and about 180 bytes of total air time per packet (after you look at headers, acks, and hops), the network should allow about 55 separate commands or reports (together with their acknowledgements) a second. Really quite a lot for a home, and it gives a bit more perspective on how hard it should be to really overload a Z-Wave mesh.

  9. Now, let's say you had 50 devices with metering, and they reported voltage, current, and power - 3 reports each. If they report once every 10 minutes, that's 150 reports / 10 minutes, or about 1 every 4 seconds. I.e., that should have almost no impact given the ability to send about 55 reports / second. At 1 report / minute per value, you'd have about 2.5 reports a second - still should be OK. On the other hand, if you had those devices reporting once every second, that would be 150 reports a second - and your network would fail.

  10. My simple conclusion - it should be very rare that anybody has to worry about having enough transmission bandwidth in Z-Wave. 55 reports / second is quite a lot, and unless you've pumped up the power reporting to ridiculously short periods between reports, you should be fine.
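
For anyone who wants to plug in their own numbers, the arithmetic above can be sketched in a few lines of Python. This is only a rough model: the function and constant names are made up for illustration, the 50% efficiency, byte counts, and 1.8 average hops are the estimates from this post rather than measured values, and the send-plus-Ack cost is rounded to 100 bytes as above.

```python
# Rough Z-Wave capacity estimate using the assumptions from this post.
# All names here are illustrative; none come from a Z-Wave or Hubitat API.

PREAMBLE_AND_DELIMITERS = 26  # bytes: 24-byte preamble plus start/end of frame (post's estimate)
FRAME_HEADER = 11             # bytes: HomeID, source ID, header, routing info, length
METER_REPORT_PAYLOAD = 15     # bytes: worst-case payload assumed in the post


def usable_throughput_bytes_per_s(link_kbps=80.0, efficiency=0.5, channels=2):
    """Usable bytes per second after CSMA/CA losses, summed over channels."""
    return link_kbps * 1000 * efficiency * channels / 8


def frame_air_bytes(payload=METER_REPORT_PAYLOAD):
    """Air time, in bytes, of a single one-way frame."""
    return PREAMBLE_AND_DELIMITERS + FRAME_HEADER + payload


def commands_per_second(throughput, send_plus_ack_bytes=100, avg_hops=1.8):
    """How many command-plus-Ack exchanges fit into one second."""
    return throughput / (send_plus_ack_bytes * avg_hops)


def offered_reports_per_second(devices, reports_per_device, interval_s):
    """Report rate generated by devices that each send on a fixed interval."""
    return devices * reports_per_device / interval_s


if __name__ == "__main__":
    capacity = commands_per_second(usable_throughput_bytes_per_s())
    print(f"Estimated capacity: {capacity:.1f} commands/s")
    for interval_s in (600, 60, 1):
        load = offered_reports_per_second(50, 3, interval_s)
        status = "OK" if load < capacity else "overloaded"
        print(f"Reporting every {interval_s} s: {load:.2f} reports/s ({status})")
```

With the defaults this comes out at roughly 55 commands a second, and passing `channels=3` scales the estimate up proportionally under the same assumptions.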


A few questions, to understand your real-world experience:
(I don't have any real-world experience in this matter, because I've been scared off from using power reporting.)

  1. How many devices do you have doing power reporting?
  2. How often?
  3. How many hops do those devices have?
  4. Which devices are they?
    (I hope that I'm not prying into sensitive areas - it's just that I've been scared off from doing any power reporting, so I'd like to hear some "success stories".)

About 10. I didn't separately look at the hop count, but on an average / expected basis, the calculation I did above would account for that. Power reports are a minimum of about 10 minutes apart. But the actual number of reports tends to be quite low - generally, it's an "at least every 10 minutes" setting, but the devices generally also don't report unless there is a change in power, so devices that are off, or that just remain "on" in steady state, don't report much.

Historically, the only device where I saw many reports was the Zooz Zen25, in two cases: when I used one to control a water pump (the power usage of the motor would vary by small amounts, triggering many reports), and when I used one to turn a pinball machine on and off (many little power variations as bumpers are hit and flippers flipped during play). Even then, it was far less than the "55" number that I calculated as an upper bound, and everything seemed to work fine.

A good guideline - look at the Z-Wave logs page and see how many total Z-Wave messages are being processed. If you're getting a few a second, you're in a "safe" zone. If you're getting dozens, then you need to trim back the power reporting. Interestingly, the Z-Wave logs page also displays the transmission channel for the messages - in the U.S., you'll see that 3 channels are used and the transmitter switches between the channels (my calculation, above, assumed 2 channels - I was wrong, there are 3, so even more bandwidth).

Of course, this only goes to the Z-wave transmission ability. If you have lots of apps that are looking at each event, then processing speed for the messages (as opposed to transmission capability) could be a limitation. Though I haven't seen that happen (Hubitat seems to have lots of spare processing capability).


Have to say this is great, an excellent analysis.
Like @jtmpush18 and considering all the issues I've had, I have power reporting off except for the washer & dryer.

Side note - power reporting on the Zooz Power Switch for the washer actually just helped me avoid a disaster. I noticed the power spiked to over 15 amps, which is unusual for my washer. When I investigated, I discovered the tub bearing had ruptured, slinging grease all over the underside of the washer. Had it not been for power reporting, I would likely not have noticed until the motor burned out. It's an old Maytag from the '70s, and the PSEG repairman looked at me and said "do not ever get rid of this washer." I asked why, and he said "they're built like tanks and last forever."

Thread derailing done...

Thanks for the very useful analysis!


The thing with power reporting is that it has to be targeted with appropriate values. The problem is that these values aren't always very clear. This is just as much a device problem as it is the concept of power reporting.

The ways I have seen power reporting implemented vary from device to device.

Zen25 has the ability to report power based on a change in power usage or on an interval. One of the interval options cannot be turned off and often reports more than one item at a time. If you set the power usage threshold too low, you can trigger a rush of Z-Wave messages as fast as the reporting device can spit them out. I have seen my Zen25 with poor settings easily trigger 10-15 reports in a second across the various reports it can do.

Zen15 not only uses power threshold and interval, but also has a % change option. This can add to the complexity of determining how many calls it makes.

Zen04 so far hasn't been too bad as long as you don't depend on it to detect zero power. I had one reporting every 2 seconds continuously while it had 0 load. It was reporting 0 then .04 then 0 then .04. This triggered other things to report at the same time as well. When you put a load on it, it didn't do that and only reported when it should. Since this is an early-release device, it is getting a firmware update to deal with that.

The point I am trying to make is that it wouldn't take too many of these with poor settings to overwhelm a Z-Wave network if other traffic goes up for a moment. I think if things are configured reasonably and you aim your configuration to capture what you need, then it probably isn't a problem. But if you set tight intervals and low wattage thresholds, it wouldn't take many devices to create 30-40 reports a second.
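
As a rough back-of-the-envelope check on that, here is a small Python sketch. The helper name is hypothetical, the per-device rates are just the 10-15 reports a second I saw from a badly configured Zen25 rather than anything from a spec, and the 55 a second ceiling is the estimate from the first post.

```python
import math


def devices_to_reach(target_reports_per_s, per_device_reports_per_s):
    """Smallest number of identical devices whose combined rate reaches the target."""
    return math.ceil(target_reports_per_s / per_device_reports_per_s)


if __name__ == "__main__":
    # Assumed burst rates for one misconfigured device (observed, not specified).
    for per_device in (10, 15):
        # 30 and 40 are the traffic levels mentioned above; 55 is the estimated ceiling.
        for target in (30, 40, 55):
            count = devices_to_reach(target, per_device)
            print(f"{count} device(s) at {per_device}/s reach {target} reports/s")
```

So on these assumptions, only a handful of chatty devices bursting at the same moment would get close to the estimated ceiling.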

I do think power reporting has gotten kind of a bad rap, but the devices don't help their own case either, since they do little to keep themselves from doing bad things.