Been building out my system and noticed some erratic behaviour related to turning on dimmer switches as an action in a rule. Using Zooz Zen77 (dimmer) and Zooz Zen32 (scene controller) devices.

I had a rule whose actions were to set dimmer A to 100% and then dimmer B to 100%. Through experiment I found that sometimes only one device turned on when the rule ran. It seemed random, so I was exploring ZWave issues and RF link concerns. I had great signal and a single hop back to Hubitat, so I doubted that was the problem. I was about to re-pair the devices with S2 encryption (I followed Hubitat's recommendation to not turn on security when I originally paired the devices).

Dim: Kitchen Island: 100
Dim: Kitchen Ceiling: 100

Then I changed the rule to first switch on dimmer A and dimmer B, followed by the dim actions:

On: Kitchen Ceiling, Kitchen Island
Dim: Kitchen Island: 100
Dim: Kitchen Ceiling: 100

Now the rules seem to run perfectly. I haven't been able to get it to turn on only one of the lights and not the other, which regularly happened before.

So while I am happy my problem is fixed I'd like to understand why. Is it possible you shouldn't issue a Dim: set_level without knowing the current state of the switch? Is dim:0 the same as switch:off? I had assumed switch:on just returns to prior dim set level. Something subtle here but I am not seeing it. Is switch:on the same as a single tap on the upper switch paddle? Is switch:off the same as a single tap to the lower switch paddle? Is dim:100 the same as a double tap on the upper switch paddle? Just wondering if there are subtle differences here.

With most drivers, no. A "Set Level" would also turn it on if off. The big exception (besides a possible device or driver problem) is if you have any "level prestaging" options on the driver that are enabled. This is not common.

Normally yes, though it's a bit of an odd case and some devices or drivers might do odd things in response. I'd use "Off" if that's what you really want, just to be sure.

Some devices have options you can use to change this, but this is also normal behavior.

Both sets of these can be different in that one works via Z-Wave from the hub while the other is an on-device feature (and the double-tap thing in particular a device-specific feature, not anything standard). But the outcome might be the same, or supposed to be.

If none of this explains what you're seeing, it might be helpful to share what driver you're using and any options you may have configured on the device (e.g., device detail page preferences).

This is the driver I am using. It is quite nicely tailored to the ZEN77. The author (jtp10181) made a tweak at my request (hence 2.0.3) that fixed a dashboard issue I was having.



[2.0.3] - 2024-04-XX (@jtp10181)

  • Added singleThreaded flag
  • Added option to set level to 0 when turned off
  • Updated Library and common code

With my driver both On and SetLevel send switch multilevel commands. The only difference is that "On" sends the value 0xFF (255) which is a special value telling it to turn on to its default/last level (depending on firmware config). Using setLevel(100) obviously will send that value instead of 0xFF.
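To make the difference concrete, here is a minimal Python sketch of the two payloads. It is not the driver's actual code (Hubitat drivers are written in Groovy), and the function names are made up for illustration. The command class and Set command bytes are the standard Z-Wave Switch Multilevel values; clamping a requested 100 down to 99 is an assumption based on the explicit level range of that command class.

```python
SWITCH_MULTILEVEL = 0x26      # Z-Wave Switch Multilevel command class
SWITCH_MULTILEVEL_SET = 0x01  # "Set" command within that class
RESTORE_LAST = 0xFF           # special value: turn on to default/last level


def on_payload() -> bytes:
    """Payload for "On": asks the device to restore its default/last level."""
    return bytes([SWITCH_MULTILEVEL, SWITCH_MULTILEVEL_SET, RESTORE_LAST])


def set_level_payload(percent: int) -> bytes:
    """Payload for "Set Level": sends an explicit level.

    Assumption: the level is clamped to 0..99, so a request for 100
    goes out as 99 rather than as a larger number.
    """
    level = max(0, min(99, percent))
    return bytes([SWITCH_MULTILEVEL, SWITCH_MULTILEVEL_SET, level])


if __name__ == "__main__":
    print(on_payload().hex())            # 2601ff
    print(set_level_payload(100).hex())  # 260163
```

Either way the device receives a single Switch Multilevel Set; only the value byte differs.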

So, you should not need to turn it on before setting the level. I am guessing just sending the device two commands is getting it to pay attention. That happens sometimes with Z-Wave if you don't have a strong mesh. If I get the supervision support built out better, that might take care of it.

I am not thinking it is ZWave signal related. The Hub is 6'-8' from the switches in question, path stats are good, it's a C8 Pro with external antenna, and I don't see similar rogue behavior with other rules that run on those same switches. I think it is a logic flaw somewhere such that the command to turn on gets dropped for one of the devices being controlled.

I am just surprised Hubitat doesn't have a more robust retry scheme so that end-to-end confirmation of actions occurs. Or, at a minimum, a flag/indicator to say the action failed.

My work around doesn't cost me anything so I'll just leave it in for now.

Your advanced drivers for all the Zooz products are terrific!

It is left up to the drivers, and it is more complicated than you would think, especially when you possibly have multiple commands queued up. I have been kicking around some better ideas in my head. You can turn on the Supervision option in the driver to try out what I have in there, to see if that helps at all. It's not great but it makes an effort.

Just an update on this subject. I was working this with the Zooz support people but they seem to think the ZEN77 isn't the issue. I am less sure. I get the usual RF is voodoo type of blanket response, we've got a zillion installed and they all work, must be your issue..... While I am new to Hubitat, I am 10+ years with Vera and never saw this sort of thing there.

Here is what I know. Sometimes a ZEN77 fails to act on the first command to it when it has been idle (and off) for a few hours. My work around was to tickle it with two commands back to back and that seemed to work better but adds clutter to rules and seems a kludge of a fix.

The ZEN77 in today's rogue behavior is about 8' away from the Hubitat hub. It is colocated in a 4 gang box with two other ZEN77s. As noted signal is strong and it is organized as a direct RF hop from the hub (01 is the Hub). The devices where I most often see this problem are where there are multiple Zooz devices beside each other in a multi-gang box. I wondered if there was some sort of RF contamination between the devices (known as RF desensitization) but Zooz quickly dismissed that.

These are the device specs for the ZEN77 which didn't turn on first try today:

  • 0x002a (42)
  • PER: 0, RTT Avg: 2ms, LWR RSSI: 39dB
  • Neighbors: 21, Route Changes: 8
  • 01 -> 2A 100kbps

I am using the driver named below.

So what is different about this instance is that it was a manual activation of the ZEN77 as opposed to a rule action. In the picture below (left side) the slider was moved to the right but the light never turned on, nor did the icon turn green as it should have. When I turn all three off with my all off rule and then try the action again (manually, one switch at a time), all three turn on fine (right side of screen).

Zooz said to try the basic Hubitat driver but I haven't done that yet. The basic driver lacks many features of the jtp10181 driver.

Since this is random it's not that easy to diagnose. It's quite annoying though. I remain puzzled why there is no confirmation handshaking to ensure the request for action is faithfully confirmed by the end node being acted upon. That seems a basic functionality requirement. User apps shouldn't have to police whether lower level commands are executed. I had thought the problem was related to how the rule sent out the actions, but this was a manual actuation of the switch, which seems to disprove that hypothesis.

 *  Zooz ZEN Dimmers Universal
 *    - Model: ZEN22, ZEN24 - MINIMUM FIRMWARE 3.07
 *    - Model: ZEN27 - MINIMUM FIRMWARE 2.08
 *    - Model: ZEN72, ZEN74, ZEN77 - All Firmware
 *  For Support, Information, and Updates:
 *  https://community.hubitat.com/t/zooz-zen-switches/58649
 *  https://github.com/jtp10181/Hubitat/tree/main/Drivers/zooz

Did you ever try enabling the supervision option in the driver as I suggested?

Not yet. I was willing to live with the issue by double turning on devices in my rules. That seemed to work.

Just recently I noticed the problem occurred when I turn on a single device manually via the dashboard. I wasn't thinking that was a problem before and as such this has bubbled this up again.

I think I will try to do a control group and two experiment groups. I have enough devices in my network to make this useful (15 dimmers, and 12 on/off switches). I'll leave 1/3 with your driver unchanged, 1/3 with your driver with supervision enabled, and 1/3 with the native Hubitat (less capable) driver. Perhaps that will show something.

I much prefer your driver to the native Hubitat one as I use several of the parameter options which aren't exposed in the native driver (although I suppose I could cryptically set them with parameter numbers and hex codes).

What exactly does supervision do? Any side effects I should expect (delays, more verbose logging, ...)?

p.s. Zooz also suggested I re-pair some of the devices as LR so they skip the mesh network and go hub and spoke. That seems a lot of work and I see in other threads LR still has some teething pains a la backups and such. What say you?

I'll defer to Jeff on any yay/nay recommendation for LR, but if you decide to give it a whirl, make sure those devices have the latest firmware installed first.

One issue with LR currently is that there's no way to do (device) firmware updates.

That's a minor issue compared to some of these other ones currently, but it's still a consideration.

I have numerous devices paired LR, and my experience has been very good so far... Knock on wood, I haven't had a need to mess around with cloud backups, so I have no idea if that bug would bite me too or not. I'm just really hoping it gets sorted out sooner vs later in case I do need a cloud backup.

Crap, I probably just jinxed myself.

It wraps the message to the device with a header that tells the device it is a supervised message. When the device gets that, it is required to respond back with the same sequence number and a code indicating if the action was taken. So if no response comes back, you can code the driver to send a retry. If you turn on debug logging you will see additional logs and also logs if there are retries. My implementation is pretty basic, based on some code the devs shared a while back. Might be worth exploring again, especially if it does help in your case, which it sounds like it should.

A lot of the system drivers have had this added but it is not documented anywhere and just on by default (but only if device is paired with security). So it could possibly be enabled on the built in Zooz drivers but you cannot tell unless you start seeing warn logs saying it is retrying.
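The supervision handshake described above can be sketched roughly as follows. This is a simplified Python simulation, not the driver's code or the hub's API: `send_supervised`, `make_flaky_device`, and the `transmit` callback are hypothetical names, and a lost message is modeled as `transmit` returning `None` instead of a real timeout. The success status value 0xFF is taken from the Z-Wave Supervision command class.

```python
from dataclasses import dataclass
from typing import Callable, Optional

SUCCESS = 0xFF  # status code meaning the device took the action


@dataclass
class SupervisionReport:
    session_id: int  # must echo the session id of the supervised message
    status: int


def send_supervised(
    transmit: Callable[[int, bytes], Optional[SupervisionReport]],
    command: bytes,
    session_id: int,
    max_attempts: int = 3,
) -> bool:
    """Send a command, retrying until a matching success report arrives."""
    for attempt in range(1, max_attempts + 1):
        report = transmit(session_id, command)
        if report and report.session_id == session_id and report.status == SUCCESS:
            return True
        print(f"no confirmation for session {session_id}, attempt {attempt}")
    return False  # give up so we never loop forever


def make_flaky_device(drop_first: int):
    """Hypothetical device that ignores the first `drop_first` transmissions."""
    seen = {"count": 0}

    def transmit(session_id: int, command: bytes) -> Optional[SupervisionReport]:
        seen["count"] += 1
        if seen["count"] <= drop_first:
            return None  # message lost: no report comes back
        return SupervisionReport(session_id, SUCCESS)

    return transmit


if __name__ == "__main__":
    device = make_flaky_device(drop_first=1)
    print(send_supervised(device, bytes([0x26, 0x01, 0xFF]), session_id=7))
```

With a device that drops only the first message, the second attempt is confirmed and the call returns `True`. A device that never answers exhausts the attempts and the call returns `False`, which is the point where a driver could log a warning.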

I was doing some testing of my all on and all off rules and noticed sometimes a device here and there did not turn on or turn off when the rule was run. The rule All On simply turns on every ZWave device one by one and the All Off turns them off one by one. They are both triggered by virtual switches named the same as the rule. Simple.

I had a case where the All On failed to turn on the porch light at 03:45:01.003. I waited a bit just to see if it was a lagging rule, but the porch light didn't light up. So I ran the All On rule again (03:45:23.506) and it still didn't work. It failed again at 03:46:45.915. Finally, at 03:47:06.471, another run worked.

So my question is who is populating the log? The hubitat system or the driver? Is a log entry to note a round trip confirmation (action completed) of the event occurring successfully or merely a command dropping in a queue to later execute (action requested)? The point being if it is just the command being queued up then it would suggest the rule engine isn't faithfully sending the rule actions reliably. If it is a confirmation of the action being successfully executed with round trip confirmation it could mean the device, RF path or driver is at fault.

Well you cut off the part of the log that says what created the log, so...

Usually for a device to log that it is "on" that means the device itself replied back to the hub saying it is on. If the log comes from anything else it is possibly just letting you know it is sending the on command.

On the left side of each log entry, the hyperlinked number identifies what generated that entry (app, driver, or hub)

OK in my effort to make the log more readable I didn't include enough of the log.

Here is more data with some highlighting to indicate the main events. OK I guess not I can't upload a pdf. That's very 1990s.

OK I'll just paste the PDF file and lose some of my spreadsheet formatting. More difficult to read.

I don't understand the dev:2 vs DNI 07 number scheme but oh well.

If you could physically see that the device was not turning on then either the device was never getting the command, or the device was having some sort of malfunction. More likely that it just did not get the command from the hub. Looks like you flipped around 7 devices on and off repeatedly, that alone is a good way to cause problems.

Yes, I agree flooding the system with ZWave commands is hardly a typical usage profile, but I am trying to stress the system to better understand why some devices fail to act on a command. Since the issue is intermittent, by definition it isn't readily duplicated.

At this point it is unclear if the rule machine is issuing the commands or if it gets bogged down and somehow drops commands, or the command is sent over the RF channel but doesn't toggle the node because of RF interference or a channel collision, or there is a logic bug in the driver which causes some messages to get lost, or the Zooz device has some issue in receiving and acting on commands.

I spent 40 years in telecom and learned long ago that coding for the nominal case was about 35% of the effort and coding for all the error traps and dealing with degenerative cases was the rest of the work. Further that systems expose such flaws when they are scaled upward with transaction traffic.

I am going to split the device population into the three bins I mentioned before (same driver with no change, same driver with supervision turned on, and the default Hubitat driver) and see if that correlates to a pattern. I'm 1300 miles away from the system under test, so I'm a bit reluctant to make major config changes remotely as I cannot un-pair and re-pair from afar if something goes amiss.

Unreliable ZWave end nodes operating under rules is why I left the Ezlo ecosystem. They had even more erratic performance and no concrete plan to correct it. Hubitat/Zooz is much better but still has gaps. My old Vera system (which I used for 10+ years) has All On and All Off "scenes" and they run successfully to completion every time, 100%. Sometimes they slow down but they always run to successful completion. My ZWave network on the Vera is also more elaborate (more devices) than what I have on the Hubitat/Zooz system. So I don't know what type of handshaking Vera built on top of the ZWave layer, but they obviously put in some sort of confirmation handshaking device by device.

I guess worst case I can continue to set up rules to command the device twice after a delay. That's a pain to do but might be the final answer. Then when I command via iPhone or wall scene controller buttons (Zen32) I'll have to accept that sometimes the action won't work and you'll have to press the button a second time. Just seems an unnecessary compromise.

There is no reason this needs to be unclear; enabling Action logging for the rule will show you what it is doing when, or if you don't want to believe that (it should be accurate but technically isn't an authoritative source at the platform level), the Events page for the device will show what commands were sent by what app (you'd be looking for commands specifically, not events).

The only way to tell for sure is a zniffer setup. But I think if RM or the hub was failing to send the commands out at all, we would hear more about it. I suspect it is a localized mesh issue.

If you cannot visually confirm whether the device is acting or not, then it is more of a challenge to troubleshoot. One thing you could do is enable debug logging for the device in question using my driver. When you have an apparent failure to update the state, issue a refresh from the device page before trying anything else. If the device responds to the refresh, either way you will get a log telling you the response (it may be a debug log if the state does not change).

Nope, not in my driver anyway. If you are concerned turn on Trace logging which logs every single incoming message nearly as soon as the driver receives it. You will find that every inbound message has action taken (may only result in a debug log saying nothing changed).

Unlikely to be a Zooz firmware issue or we would hear more about it.

Probably just keeps hammering away until the device responds. Which is just a band-aid for poor mesh performance. Same as what the All Off app does, but I think it eventually gives up so as to not be stuck in an infinite loop of retries. There is nothing more magical that you can do that would make it work any better.

Yes, it is unnecessary if you have a fully functional mesh. I rarely have to run any actions multiple times to get results. The only time it may happen is if one of my larger automated shutdown rules just ran and I go to turn a light back on; sometimes it gets very delayed due to the heavy traffic of the shutdown and locks.

If you really wanted to, my drivers could be modified to just send the commands out twice each time you press On.

The issue with actually coding in retries gets very complicated because what if the user clicks on, and it does not work instantly so they click off then on again and then double tap up twice. Do you retry all of those commands causing the light to possibly be flickering on and off, or how, in the code, do you decide what the user wants to do?
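One possible answer to that question, sketched in Python purely as an illustration (it is not what the driver does, and `PendingCommands` is a hypothetical helper), is a "latest command wins" policy: only the newest unconfirmed command per device is ever eligible for retry, so older clicks are never replayed.

```python
from typing import Dict, Optional


class PendingCommands:
    """Tracks only the newest unconfirmed command per device."""

    def __init__(self) -> None:
        self._pending: Dict[str, str] = {}

    def issue(self, device: str, command: str) -> None:
        # A newer command replaces any older unconfirmed one for that device.
        self._pending[device] = command

    def confirm(self, device: str, command: str) -> None:
        # Clear the entry only if the confirmation matches the newest command;
        # a late confirmation of a superseded command is ignored.
        if self._pending.get(device) == command:
            del self._pending[device]

    def to_retry(self, device: str) -> Optional[str]:
        # The one command worth retrying, or None if nothing is outstanding.
        return self._pending.get(device)


if __name__ == "__main__":
    pending = PendingCommands()
    for click in ["on", "off", "on"]:  # impatient user clicking repeatedly
        pending.issue("Kitchen Island", click)
    print(pending.to_retry("Kitchen Island"))  # on
```

This avoids the flickering case, but it still guesses at intent: it assumes the last click is what the user wants, which may not hold for something like a double tap.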