I was thinking of the scenario I listed above, and you're probably right: for the state of a switch, it's probably not helpful.
Ah! I went back and re-read your post and I get what you were saying. Sorry about that.
Most of the bases have been covered here, so I will just add my personal approach for Z-Wave:
- Use compatible devices. Of the 2 Z-Wave devices that misbehave in my network, 1 is explicitly not compatible with C7 and the other is not yet on the compatible list.
- A small amount of troubleshooting may help to validate a quick-and-dirty band-aid. The 2 devices that I mentioned took a long time to update switch state. Repeating the same command did not help, because it would just be ignored if the device thought it was already in the commanded state. It made more sense for me to send a refresh command a bit before or after a change.
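For anyone who wants to picture the "command, then refresh" band-aid, here is a minimal Python sketch. It is not Hubitat code: `send`, `refresh`, the device id, and the 2-second settle time are all hypothetical stand-ins for whatever your hub or automation layer actually exposes.

```python
import time
from typing import Callable


def command_then_refresh(
    device_id: str,
    command: str,
    send: Callable[[str, str], None],
    refresh: Callable[[str], None],
    settle_s: float = 2.0,  # assumed delay; tune per device
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Send a command, wait briefly, then ask the device to report its state."""
    send(device_id, command)   # hypothetical: issues the on/off command
    sleep(settle_s)            # give a slow device time to act
    refresh(device_id)         # hypothetical: requests a fresh state report
```

The point of the refresh is only to get the reported state to catch up sooner; it does not make the original command any more likely to arrive.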
As someone who uses Hubitat to control my HVAC, I am very concerned about the occasional missed message. I have a routine that runs periodically, like every 10-15 minutes, and re-commands the devices to the desired states. This is effective for me, at least with the devices and drivers I am using. It avoids something being in the wrong state for very long. I suppose this strategy could fail if the drivers assume the commands they issue are received and they don't re-send commands when they think the state is not changing.
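A routine like the one described above could be sketched roughly as follows. This is only an illustration in Python, not the actual rule: the `send` callable, the device names, and the 600-second interval are assumptions chosen to match the "every 10-15 minutes" description.

```python
import time
from typing import Callable, Dict, Optional


def recommand_all(desired: Dict[str, str], send: Callable[[str, str], None]) -> int:
    """Re-send the desired state to every device, whatever state it reports."""
    for device_id, state in desired.items():
        send(device_id, state)  # hypothetical: always sends, no state check
    return len(desired)


def run_periodically(
    desired: Dict[str, str],
    send: Callable[[str, str], None],
    interval_s: float = 600.0,          # assumed: 10 minutes
    cycles: Optional[int] = None,       # None means run forever
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Repeat recommand_all every interval_s seconds; return cycles completed."""
    done = 0
    while cycles is None or done < cycles:
        recommand_all(desired, send)
        done += 1
        if cycles is None or done < cycles:
            sleep(interval_s)
    return done
```

Note that the sketch deliberately does not compare against the last known state before sending. As mentioned above, the approach only works if the command really goes out each time.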
That would be a pretty crappy driver, in my opinion.
When a command is issued on the hub a driver should ALWAYS send the command. With manual commands there should be no logic, just a send.
And that is how pretty much all drivers I've looked at work. Thank goodness.
Does anyone with more than a dozen or two devices NOT have reliability issues at times?
Back when I had about 15 switches everything seemed rock solid. But in the last few months I've added about 60 more switches and about 20 sensors. And now it's unusual for me to go a day without reliability issues.
I've spent many, many, many hours trying to resolve problems and move load off the HE hub (to Node-RED for logic and Home Assistant for display) without really seeing any improvement. I've bought 6 Z-Wave repeaters. But still, if I issue a good-night command to turn off all of the lights in my house (using Simple Automation Rules and Groups), they NEVER all go off. That's just one reproducible example - most problems are very intermittent and dispersed.
Is this to be expected? Does anyone have a larger installation that never or rarely misses commands? Just curious how I should set expectations.
I was having a problem today with a switched outlet. I finally tried switching it manually: nothing. Took a look at the clock radio: blank. A fancy arc-detecting breaker had tripped, for no reason at all. What genius required those?
On another note, we had a 4-hour power outage today and my network seems to have survived intact (lucky). I just counted: 68 Z-Wave and 16 Zigbee devices.
I put several straightforward automations on associations.
I have a decent sized installation that never (or very very rarely) misses commands:
74 devices on zigbee2mqtt (of which ~67 are brought into Hubitat as virtual devices)
22 zigbee devices on Hubitat-1
12 zigbee devices on Hubitat-2
76 z-wave devices on Hubitat-2
38 virtual devices on Hubitat-2
95% of my automation is done on Node-RED.
Edit: Missed commands are rare enough that the last time I recall an automation failing was when a motion sensor had a dead battery. Of course, that was the last thing I checked ..... dumbass that I am.
We have reason to believe that Groups and Scenes can both overrun the radios. We have an interim fix for that coming in the next release, which allows metering the rate at which devices are hit by these apps. If an app throws out a large number of commands in quick succession, the radios can drop some of them. Hopefully we can address this eventually in the radio stacks, but some of it is happening in the RF domain, and isn't immediately obvious in software -- other than the command getting lost. Both Groups and Scenes are quite simple when it comes to issuing device commands -- that is to say, they just blast them out fast.
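To illustrate what metering means in general terms, here is a small Python sketch that spaces commands out instead of blasting them. It is not how Groups or Scenes are implemented and says nothing about the actual fix; `send`, the device ids, and the 50 ms delay are assumptions for illustration only.

```python
import time
from typing import Callable, Iterable, Tuple


def send_metered(
    commands: Iterable[Tuple[str, str]],
    send: Callable[[str, str], None],
    delay_s: float = 0.05,  # assumed gap between commands
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send (device_id, command) pairs one at a time with a pause between them."""
    count = 0
    for device_id, command in commands:
        if count > 0:
            sleep(delay_s)      # pause before every command except the first
        send(device_id, command)  # hypothetical: hands the command to the radio
        count += 1
    return count
```

With 60 switches and a 50 ms gap, a good-night command would take about 3 seconds to go out, which is the trade-off: slower overall, but fewer commands arriving at the radio at once.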
Beyond that, I have seen Z-Wave sporadically drop a command -- it may be related to the above. Mesh issues can be lurking.
One thing I find very interesting is that people will post about this or that app failing to do something, pointing a finger at the app as the obvious culprit. Almost always it turns out to be device related (exceptions do happen -- someone just brought a bug in RM to my attention a few minutes ago). When I say "device related", that sweeps in a whole gamut of issues, including device specific issues, driver issues, mesh issues, etc. These are not easy to pin down.
Despite spending many hours trying to figure it out, I can't say with any certainty where my problems are. I've spent hundreds of dollars replacing devices which show errors or have long latencies (learned from the Hubitat Z-Wave Mesh Details App). More money has been spent adding repeaters. I've studied the logs and tried to reproduce problems for hours. I've written a ton of Node-RED and Home Assistant components to try to reduce load on the HE hub and make it as dirt simple as possible. And to be honest, I still feel clueless and don't know enough to point any fingers or pick a culprit. I really wish I could! Maybe the problems really are in the RF domain?
I know there are probably many other higher priority projects around, but some software tools and well-written practical guides on how to debug and improve the reliability of our systems would be much appreciated. Today, I don't think most of us would feel any confidence in finding and removing "lurking mesh issues."
I'll keep trying!