Need Some Help Understanding Z-Wave Times and Routing

A long post, but I need help...

I have a relatively simple "Simple Automation Rule":

  • When virtual contact opens, turn on 5 lights
  • When virtual contact closes, wait 1 minute, turn off 5 lights
  • Only do this when it's dark (sunset -30 mins to sunrise +30 mins)
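For anyone following along, here is a minimal Python sketch of the "only when it's dark" condition. It is not how Simple Automation Rules evaluates the restriction internally; the function name `is_dark`, the use of same-day sunrise/sunset values, and the example times are all assumptions for illustration.

```python
from datetime import datetime, timedelta

def is_dark(now, sunrise, sunset, offset=timedelta(minutes=30)):
    """Return True inside the window sunset - offset .. sunrise + offset.

    `sunrise` and `sunset` are today's values (an approximation, since
    tomorrow's sunrise differs slightly). The window wraps past midnight,
    so it is "dark" late in the evening OR early in the morning.
    """
    return now >= sunset - offset or now <= sunrise + offset

# Hypothetical sunrise/sunset times, not taken from the post.
day = datetime(2021, 1, 15)
sunrise = day.replace(hour=7, minute=15)
sunset = day.replace(hour=17, minute=30)

print(is_dark(day.replace(hour=7, minute=0), sunrise, sunset))   # early morning
print(is_dark(day.replace(hour=12, minute=0), sunrise, sunset))  # midday
```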

The purpose is to light the way when it's dark.

This rule performs sporadically. Sometimes all the lights fire "instantly" (no noticeable delay), but this morning I had issues 2 times. As a note, all lights respond immediately when I turn them on/off through either the dashboard or the device page.

Here is the performance (from device logs) this morning when I left and when my wife left:

Case 1:
07:00:56.113: Virtual Contact Open
07:00:56.408: Kitchen Hallway On (OK)
07:00:56.410: Foyer Light On (OK)
07:00:57.542: Front Door Outside Light On
07:00:59.586: Master Bedroom Hallway On
07:00:59.583: Laundry Room Light On

07:01:01.609: Virtual Contact Close
07:02:01.914: Kitchen Hallway Off (OK)
07:02:01.948: Foyer Light Off (OK)
07:02:02.811: Front Door Outside Light Off
07:02:04.619: Master Bedroom Hallway Off

You will notice that the Laundry Room Light never registered off...even though it turned off about 5 seconds after the Master Bedroom Hallway.

Case 2:
07:06:54.686: Virtual Sensor Open
07:06:54.861: Kitchen Hallway Light On (OK)
07:06:54.904: Foyer Light On (OK)
07:06:55.643: Front Door Outside Light On (OK-ish)
07:06:59.472: Master Bedroom Hallway On

07:07:01.250: Virtual Sensor Close
07:08:01.548: Kitchen Hallway Light Off (OK)
07:08:01.573: Foyer Light Off (OK)
07:08:02.266: Front Door Outside Light Off (OK-ish)
07:08:03.690: Master Bedroom Hallway Off
07:08:04.157: Laundry Room Light Off

In this case, the Laundry Room Light never registered as on, even though I saw it turn on just as I was closing the door.
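To put numbers on the delays, here is a small Python sketch that computes each light's latency relative to the trigger, using the Case 1 "on" timestamps quoted above. The helper names `parse` and `latencies` are made up for this example.

```python
from datetime import datetime

def parse(ts):
    """Parse a log timestamp such as '07:00:56.113'."""
    return datetime.strptime(ts, "%H:%M:%S.%f")

def latencies(trigger, events):
    """Seconds between the trigger and each (name, timestamp) event."""
    t0 = parse(trigger)
    return {name: round((parse(ts) - t0).total_seconds(), 3)
            for name, ts in events}

# Case 1 "on" events, copied from the device logs in the post.
case1_on = [
    ("Kitchen Hallway", "07:00:56.408"),
    ("Foyer Light", "07:00:56.410"),
    ("Front Door Outside Light", "07:00:57.542"),
    ("Master Bedroom Hallway", "07:00:59.586"),
    ("Laundry Room Light", "07:00:59.583"),
]

delays = latencies("07:00:56.113", case1_on)
for name, seconds in delays.items():
    print(f"{name}: {seconds:.3f} s")
```

The two directly connected lights land at about 0.3 s, the Front Door Outside Light at about 1.4 s, and the two routed lights at about 3.5 s.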

Z-Wave Details:
I have a strong mesh with 58 mains powered devices and 1 battery powered device. I have 1 ghost node that I haven't gotten around to removing with my Z-WAVE USB key.

Kitchen Hallway Light (0x3F): Direct (9 dB)
Foyer Light (0x36): Direct (-6 dB)
Front Door Outside Light (0x42): Direct (-6 dB)
Master Bedroom Hallway (0x41): 01 -> 12 -> 07 -> 41 40 kbps (-2 dB)
Laundry Room Light (0x3E): 01 -> 20 -> 3E 40 kbps (21 dB)

I did a repair (multiple times) on the Master Bedroom Hallway and Laundry Room Light, and they will not choose different paths. The Laundry Room Light is physically about 10 feet away from the Kitchen Hallway Light (which is direct connect to the hub...about 8 feet away).

Here are the paths for the repeaters:
0x12: 01 -> 21 -> 41 -> 06 -> 12 9.6 kbps (25 dB)
0x07: 01 -> 41 -> 06 -> 07 9.6 kbps (3 dB)
0x20: Direct

0x21: Direct
0x41:
0x06: 01 -> 41 -> 07 -> 06 100 kbps (no RSSI)
0x07: 01 -> 41 -> 06 -> 07 9.6 kbps (3 dB)

The thing is...0x06 is a Zooz multi-relay (Zen something...has 3 relays) and creates a contact closure when the door opens, which fires a virtual contact sensor. This is ALWAYS virtually instantaneous...even though it's MUCH farther away from the hub than the Laundry Room Light.

If you look at some of these paths, it looks like there is a Z-Wave message loop? Am I interpreting this correctly?
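One way to check this is to test two separate things: whether any single path visits a node twice, and whether two nodes each appear as a repeater in the other's path. The Python sketch below does both for the paths quoted above. It assumes each displayed path is the complete route for one destination; the function names are hypothetical.

```python
def repeats_a_node(route):
    """True if a single route visits the same node more than once."""
    return len(set(route)) != len(route)

def mutual_dependencies(routes):
    """Pairs (a, b) where a repeats for b AND b repeats for a."""
    repeaters = {dest: set(path[1:-1]) for dest, path in routes.items()}
    pairs = set()
    for a, reps in repeaters.items():
        for b in reps:
            if b in repeaters and a in repeaters[b]:
                pairs.add(tuple(sorted((a, b))))
    return sorted(pairs)

# Paths as listed in the post (0x01 is the hub).
routes = {
    0x41: [0x01, 0x12, 0x07, 0x41],
    0x3E: [0x01, 0x20, 0x3E],
    0x12: [0x01, 0x21, 0x41, 0x06, 0x12],
    0x07: [0x01, 0x41, 0x06, 0x07],
    0x06: [0x01, 0x41, 0x07, 0x06],
}

print(any(repeats_a_node(r) for r in routes.values()))
print([(hex(a), hex(b)) for a, b in mutual_dependencies(routes)])
```

On this data no single path revisits a node, but 0x06/0x07, 0x07/0x41, and 0x12/0x41 each list the other as a repeater. That is a circular dependency between paths rather than a loop inside any one path.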

The Master Bedroom Hallway switch is 3 feet away (on the other side of the hall) from the Foyer and Front Door Outside Lights, both of which connect directly. However, the Master Bedroom Hallway doesn't route through either of them. It goes FARTHER from the hub to 0x07, which routes to 0x12 (closer than 0x07, but still farther than 0x36 or 0x3F), and then from 0x12 back to the hub...and the route from 0x12 is MUCH FARTHER than a direct connection from this switch to the hub would have been.

Can I force a Z-Wave route?

I never had these issues with my previous SmartThings...same Z-Wave devices, locations, and rules.

I have attached a Z-Wave floor plan showing positions.

Key:

  • Green = Z-Wave Plus
  • Blue = Z-Wave

Hubitat hub is the black circle to the "right" of the kitchen.

Mesh speed is impacted by two things, the 'baud rate' (9.6k - 40k - 100k are the fixed speeds of communications) and 'quantity' of packets.

The more hops, the more total packets. Hub to hop 1 to hop 2 to target device puts 3x the number of packets on the mesh vs. hub to target. A 9.6k packet takes about 10x the airtime of the same packet at 100k, and it blocks or delays other packets until it's done.
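A rough back-of-the-envelope version of that arithmetic, as a Python sketch. The 20-byte frame size is a made-up example value, and the model ignores preambles, ACKs, and retries, so treat the numbers as relative rather than exact.

```python
def airtime_ms(frame_bytes, kbps, transmissions=1):
    """Approximate on-air time in milliseconds.

    bits / (kbit/s) gives milliseconds; each hop retransmits the frame,
    so the total scales with the number of transmissions.
    """
    bits = frame_bytes * 8
    return transmissions * bits / kbps

FRAME = 20  # hypothetical frame size in bytes

for rate in (9.6, 40, 100):
    direct = airtime_ms(FRAME, rate)
    two_repeaters = airtime_ms(FRAME, rate, transmissions=3)
    print(f"{rate:>5} kbps: direct {direct:.2f} ms, via 2 repeaters {two_repeaters:.2f} ms")
```

Under these assumptions a frame routed through two repeaters at 9.6 kbps occupies the air for 50 ms, versus 1.6 ms for the same frame sent directly at 100 kbps.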

Z-device packets are tiny and the airwaves are vast. Many homes, mine included, have several Z-Wave meshes coexisting. I have between 3 and 5 Z-Wave meshes running all the time. They don't interfere because of the giant 'gaps' between packets. Z-Wave radios are half-duplex: a device is either sending or receiving at any given moment, never both. One Z-device will send a message and it won't start another until it gets the ACK. The series 700 chip has reserved the first 5 node IDs for itself. This offers the potential to have multiple parallel conversations, although in practice I don't actually see it much. I see IDs 2-5 being used for Includes/Excludes when I happen to be looking.

Therefore, you may be encountering problems due to ANY or ALL of the above. :smiley: (Well that doesn't help he says :slight_smile: )

One of the classic mesh-crippling situations we see, from reading hundreds of posts here in the Community, comes from power monitoring sensors. Maybe you have a washing machine plugged into a power monitoring outlet and it's just flooding the mesh with minutiae on how much power is being used. In that example, do you really need to know that it's using 15 watts, oh 16 watts, oh 14 watts? These devices usually have a means of reducing the number of reports...they typically offer 1) a report on a set time schedule and 2) a report on a change in value. I believe most default to BOTH, and leaving it that way usually eats up the mesh.

It's an investment, but you can always look into building a Zniffer to allow you to literally watch the flow of packets and see what's occurring in that space before the Laundry light.

Z-device routing is really hard to visualize and even harder to comprehend most of the time. The mesh is trying to be 'big enough' to encompass the current AND the future area your devices might cover. In other words, a bunch of close-in devices may not have any 'value' to the mesh because the next most distant 'ring' of devices can 'hear' a larger quantity of devices AND reach further towards the next 'ring'. Two devices being side by side is a human perception that is not identifiable to the hub. The hub asks for a list of neighbors from EACH device, and devices side by side are certainly neighbors, but there's nothing to indicate they are practically touching. 2 inches or 15 feet appear the same to the hub. Opposite sides of the room appear the same as touching.
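To illustrate why physical distance doesn't show up, here is a toy Python sketch that picks a path from nothing but a neighbor table. This is NOT the Z-Wave SDK's actual route selection; the breadth-first search, the node names, and the neighbor table are all invented for the example. The point is only that the input contains "who can hear whom" and no distances.

```python
from collections import deque

def fewest_hops(neighbors, start, goal):
    """Breadth-first search over a neighbor table; returns one shortest path."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # goal not reachable from start

# Hypothetical neighbor lists: adjacency only, no distance information.
neighbors = {
    "hub": ["A", "B"],
    "A": ["hub", "B", "C"],
    "B": ["hub", "A"],
    "C": ["A", "D"],
    "D": ["C"],
}

print(fewest_hops(neighbors, "hub", "D"))
```

Whether A and B are 2 inches or 15 feet apart, the table looks identical, so anything choosing routes from it cannot prefer the "obviously closer" device.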

Thanks for the response!

I don't have any Z-Wave devices like this. The majority of them are switches, with the exceptions being:

  • 2 Zooz Zen 16 MultiRelay (Z-Wave plus)
  • 2 Honeywell Thermostats (Z-Wave)
  • 1 Kwikset Lock (Z-Wave...joined S0)
  • 1 Go Control Garage Door Opener (Z-Wave...joined S0)
  • 1 Go Control GD00Z-8 Garage Door Opener (Z-Wave plus...joined S0)
  • 1 Tilt Sensor (Z-Wave plus...only battery powered device)

I have never run a full Z-Wave repair in the Hubitat (still relatively new)...but have clicked the repair for individual devices.

OK - did some more playing around.

I can report that the new Group Metering feature set to 50 ms solved all my issues with this.

OK - I spoke too soon.

I am convinced that there is a firmware bug with outgoing Z-Wave messaging...inbound always works, but outbound appears to get clogged.

Still having issues with lights triggering from automations. Here is what I have done so far:

  • Moved automations from Webcore to Hubitat apps
    -- Motion and Mode Lighting
    -- Rule Machine
    -- Simple Automation Rules

I was still having intermittent issues. So I removed my last ghost node, and increased the group metering delay to 75 ms.

It's "more better", but I still have occasional issues.

The thing is - inbound Z-Wave messages ALWAYS arrive...it's the outbound Z-Wave messaging that appears to get "clogged" at the hub.

The Z-Wave plus "explorer packets" appear to work. If I "repair" a z-wave plus node, nothing changes. If I pull the air gap on a repeater node, the outbound message will reroute...eventually.

Sometimes, the hub appears to get "clogged" without outgoing requests (in this case, from the dashboard). When they get "clogged", simply physically flipping the switch on and off unclogs them...as inbound data never appears to get clogged.

I have 1 Z-Wave Plus tilt sensor (for a garage door), and the messages from this sensor ALWAYS arrive even when the outgoing messages get clogged.

I had a situation this morning, while leaving the house, where outbound z-wave messages were delayed by 30+ seconds. Not much is happening during these automations:

  • Z-wave plus contact closure for security system arm
    -- Trigger the Arm routine
    ---- Say something on Sonos
    ---- Set 2 Z-Wave (non-plus) heating and cooling setpoints
    ---- Wait 5 minutes
    ---- Lock the door, close the garage doors, turn everything off in groups
    ------ Groups have 75 ms group metering
    ------ 1 second delay between groups
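For reference, the timing that this setup asks for can be sketched in Python as follows. This only models the intended spacing (75 ms between commands within a group, 1 second between groups). How the hub implements metering internally is not known here, and measuring the 1 second gap from the last command of the previous group is an assumption. The group contents are hypothetical.

```python
def metered_schedule(groups, metering_ms=75, group_gap_ms=1000):
    """Return (offset_ms, device) pairs for commands sent group by group."""
    schedule = []
    t = 0
    for group_index, group in enumerate(groups):
        if group_index > 0:
            t += group_gap_ms      # pause before starting the next group
        for device_index, device in enumerate(group):
            if device_index > 0:
                t += metering_ms   # spacing between commands in a group
            schedule.append((t, device))
    return schedule

# Hypothetical grouping of the entry lights.
groups = [
    ["Kitchen Hallway", "Foyer Light", "Front Door Outside Light"],
    ["Master Bedroom Hallway", "Laundry Room Light"],
]

for offset, device in metered_schedule(groups):
    print(f"{offset:>5} ms  {device}")
```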

I made a change where I am delaying the Z-wave (non-plus) thermostat heat/cool settings now...we will see what happens.

Here is the data from this morning...normal event...wife leaving:

  • 06:40:56.214 switch on Laundry Room Light was turned on [physical] DEVICE physical (turned on by hand)

  • 06:47:50.025 switch on (A5->H5) Entry Doors: switch is on DEVICE physical

  • 06:47:50.247 contact open Entry Doors was opened DEVICE

  • 06:47:50.468 switch on Foyer Light was turned on [digital] DEVICE digital

  • 06:47:50.528 switch on Kitchen Hallway Light was turned on [digital] DEVICE digital

  • 06:47:50.715 switch on Front Door Outside Light was turned on [digital] DEVICE digital

  • 06:47:50.856 switch on Master Bedroom Hallway was turned on [digital] DEVICE digital

  • 06:47:58.345 switch off (A5->H5) Entry Doors: switch is off - DEVICE physical

  • 06:47:58.457 contact closed Entry Doors was closed DEVICE

  • 06:48:58.713 switch off Foyer Light was turned off [digital] DEVICE digital

  • 06:48:58.812 switch off Kitchen Hallway Light was turned off [digital] DEVICE digital

  • 06:48:58.979 switch off Front Door Outside Light was turned off [digital] DEVICE digital

  • 06:48:59.115 switch off Master Bedroom Hallway was turned off [digital] DEVICE digital

  • 06:48:59.348 switch off Laundry Room Light was turned off [digital] DEVICE digital

I was pretty jazzed...everything worked great. Now, when I leave, it messes up:

  • 06:56:32.761 switch on (A2->H2) ADT Armed Away: switch is on DEVICE physical
  • 06:56:33.869 heatingSetpoint 66.0 °F Living Room Thermostat heatingSetpoint was set to 66.0°F DEVICE
  • 06:56:33.933 heatingSetpoint 66.0 °F Master Bedroom Thermostat heatingSetpoint was set to 66.0°F DEVICE
  • 06:56:36.654 coolingSetpoint 82.0 °F Master Bedroom Thermostat coolingSetpoint was set to 82.0°F DEVICE

Where's the other setpoint change...34 seconds later!

  • 06:56:38.069 switch on (A5->H5) Entry Doors: switch is on DEVICE physical
  • 06:56:38.178 contact open Entry Doors was opened DEVICE
  • 06:56:49.194 switch off (A5->H5) Entry Doors: switch is off DEVICE physical
  • 06:56:49.288 contact closed Entry Doors was closed DEVICE

You can see that I waited 10 seconds for the lights to turn on! Can't wait much longer as the alarm is counting down.

  • 06:57:05.818 switch on Kitchen Hallway Light was turned on [digital] DEVICE digital
  • 06:57:05.846 switch on Foyer Light was turned on [physical] DEVICE physical
  • 06:57:06.186 switch on Front Door Outside Light was turned on [digital] DEVICE digital
  • 06:57:06.312 switch on Master Bedroom Hallway was turned on [digital] DEVICE digital
  • 06:57:06.471 switch on Laundry Room Light was turned on [digital] DEVICE digital
  • 06:57:07.284 coolingSetpoint 82.0 °F Living Room Thermostat coolingSetpoint was set to 82.0°F DEVICE

The lights turn on approximately 30 seconds LATER than they were supposed to! Oh...there's the other thermostat setpoint...WTF?

  • 06:57:49.441 switch off Foyer Light was turned off [digital] DEVICE digital
  • 06:57:49.528 switch off Kitchen Hallway Light was turned off [digital] DEVICE digital
  • 06:57:49.710 switch off Front Door Outside Light was turned off [digital] DEVICE digital
  • 06:57:49.843 switch off Master Bedroom Hallway was turned off [digital] DEVICE digital
  • 06:57:50.110 switch off Laundry Room Light was turned off [digital] DEVICE digital

I also had my 2 Z-Wave (non-plus) thermostats set with the wrong device driver...I had a Z-Wave plus driver.

I just changed them to the Z-Wave (non-plus) generic Z-Wave thermostat driver.

I know, I know...changing 2 things at once isn't very scientific...

Well, not to be an arse, but unless you are sniffing the packets you really don't know, no matter how convinced you think you are.

You could be right, but no one else is reporting this. So if it is something unique to your install, no one will be able to prove it one way or the other. :man_shrugging: A sniffer is the only conclusive way to prove something like that.

Where do I find said Z-Wave sniffer?

Is this something you purchase, or is it software that runs with a USB Z-Wave stick?

There are a bunch of threads on here about sniffers. I'll post a link if I get time (am on my phone right now).

To sniff, you use a USB stick plus software on a Windows PC. The sniffer needs to be placed in a similar area to the hub so that it 'sees' what the hub 'sees'. A laptop can be handy for this.

The nice part of doing it this way is that if you don't see any Z-Wave packets when you are issuing commands on the hub, then you can be 100% sure that the issue is on the hub side.

If you DO see outgoing packets, well then the issue could be a whole bunch of different things.

You buy a 'sacrificial' Z-Wave USB stick and flash it with Zniffer code. The code to flash it back isn't public, so you'd be wise to capture and save the original code from the sacrificial stick first. After that, just run the Zniffer software on a PC and watch, interpret, and adjust.

I use one of these on a RasPi. The Z-Way image includes a nice browser-accessible Zniffer utility.

Z-Wave.Me RaZberry2 - Z-Wave Plug-On Module for Raspberry Pi (US frequency) https://www.amazon.com/dp/B01M3Q764U

I will check these out...thanks!

For what it's worth, I have the same behavior on my C7. About 100 devices, ~60 of which are mains powered (no ghosts). Reporting and status changes are always instant, but I am totally unable to control anything without a few hard reboots or waiting a day or two for random magic to happen. For example, my kitchen lights instantly show a state change when I physically tap the switch, but absolutely nothing will let me control the lights from Hubitat.

I'm planning on taking a day off next week to do some packet sniffing, since I have nothing but anecdotes, which I realize are not helpful.

I am currently on 2.2.6.140...I am hoping the latest 2.2.7 release will address this...mainly this comment in the release notes:

  • C-7: Fixed a bug in Z-Wave Multi-Channel causing erroneous Application Busy responses to Multi-Channel Endpoints reporting state updates. This could have caused extra traffic and lost packets under moderate load.

or this one

  • Fixed a bug in Z-Wave supervisionv1 and supevisionv2 with null values causing invalid commandLength.

or this one

  • C-7: Periodic Z-Wave node neighbor updated to assist the SDK in making better routing decisions.

Haven't installed 2.2.7 yet.

2.2.7 appeared to fix things for me for about a day, but if we are experiencing the same thing then I have bad news for you :slight_smile: .

I suspect there isn't a bug per se, but that some super-chatty nodes that are (illogically) operating at 9.6kbps are at fault in my case. I'm planning on sniffing next week, but also know that I have some insanely chatty Zen15 and Multisensor 6 devices that are spamming the hell out of the network. It just seems weird that on a congested network, status updates are still 100% working, and everything outbound is failing.

One other thing that occurs when my network is in this state is that no repairs will work. Seems related, but I won't know until I can look for outbound packets (or lack thereof).

I just updated to 2.2.7.123.

Z-Wave appears to be better...although about 5 of my 59 Z-Wave devices went from direct connect to 9.6 kbps through a couple of hops...hopefully this will work its way out like it usually does.

I created a group for the entry lights a few weeks back to test what's going on. This appears to be working well after the install of this new firmware.

For groups of Z-Wave devices, try putting them in groups and then turning on "Enable metering?". I set the metering delay to 75 ms and that helped with the previous firmware...things appear to be very snappy right now after the update.

Haven’t looked at mine for a while as everything is working, but in looking just now, while there are some interesting routes being taken, all of my devices except 2 (1 is a device limitation, and the other is in a highly shielded location) are at 100Kbps which is a large improvement.

@bcopeland must have been plying his magic in the last release or so.

Why yes..
