Unstable and inconsistent results all of a sudden

I have a C7 Hub that has been working fine for months. It has been a few weeks since I touched or added any devices. My network is mostly Z-Wave with a handful of Zigbee. Primarily Zooz devices, plus some Jasco, Fibaro, and Aeotec. The problems I have seem to affect them all.

In the last 2 or 3 weeks I have noticed things intermittently not working correctly. I will issue a command to a switch/group, either via Alexa or the Hubitat web interface, and either nothing will happen or only some of the devices will respond. Sometimes the UI hangs a bit with the on/off button staying grey; other times the UI says a switch is on/off when it is not. It's intermittent and happens with many different devices, but without a doubt something will malfunction roughly 1 out of 7 times. It may be anecdotal, but it seems more prevalent with devices in groups. Often when a group is triggered, one device will work and another will not.

I am on 2.3.5.146 firmware. It is quite possible that these issues started with one of the more recent Hubitat updates, but because I tend to apply them quickly and there have been a bunch recently, I cannot nail down which one or when, other than "recent".

So far I have:

  1. Shut down my hub, left it unplugged for about 10 minutes, then powered it back on
  2. Rebooted EVERYTHING (threw the main breaker in the house)
  3. Done a Z-Wave Repair before and after the total reboot

Here is a screenshot of what is happening. In this case I used Alexa to trigger a group called Bedroom lights that contains 2 switches. I turned them on and it worked fine. Within a few seconds I asked Alexa to turn them off, and neither light turned off. Looking at the properties of the two switches, one indicates on and the other off. They are both actually on. I let them sit for at least 10 minutes with no change. I then hit Refresh on the light showing as off and it correctly refreshed its state to on.

These results are typical of what I am seeing. Does anyone have suggestions on what I can do to troubleshoot this further? Below is my Z-Wave Settings page as well, in case that sheds some light on something.


Hit configure on the device page. See if the status updates correctly with physical events.

Yes it does

First, for giggles, restore to the previous day's backup. This will ensure you have a clean database. (This is pretty much a soft reset).

Anything stand out as chatty in the logs?

On a personal note, I would re-pair all those S2-encrypted devices with no encryption (except for door locks). This would lower the overhead on the mesh, and encryption really isn't necessary for them.

Is it the same device(s) that don't turn on/off?

Also for giggles, power cycle all your Z-Wave devices. One may have gone rogue and need a reset (switches can be air gapped).

I "air gapped" everything by shutting down the hub and throwing the main breaker in the house already.

Nothing is chatty in the logs at all. In fact if I don't trigger something it's unlikely I will have more than one or two entries an hour.

The behavior is intermittent and affects different devices at different times, but it is most prevalent with groups.

As far as the S2 security goes, is there a good way to see if I have high CPU utilization or some other indicator that there is a problem? I would hate to have to exclude 25+ devices and rebuild all my automations (not that there are a ton) without at least some indication that S2 was the problem.

Thanks

If hitting Configure fixes the issue, then I would suggest just doing that and seeing if you continue to notice anomalies. I don't know the root cause, but I had something similar occur with a few switches that had been connected for months. I hit Configure (which essentially resets the device driver, at least when properly implemented) and had no issues after.

Hitting Configure refreshed the state correctly, but the problem comes back.

Maybe I will just pull them all out, firmware update everything, and re-add them without S2. If nothing else, it will change a bunch of stuff.

I mean, the general speed of things working is what you will notice by removing encryption. There really isn't a CPU spike you will notice. But encryption certainly takes up a fair bit of bandwidth, regardless of whether it's IP, Z-Wave, etc. Processes are dedicated to encryption/decryption.

As a counterpoint, I'm in the pro-S2 camp. I use it with every device that supports it.

The biggest benefit for me is that it uses full CRC checksumming even on non-100 kbps links. That adds to the reliability of messaging, and to me it's worth it. I don't notice any performance issues at all, except when doing a firmware upgrade on a device.
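
For anyone wondering why a stronger checksum matters, here is a rough sketch in Python. It is only an illustration of the general idea, not Z-Wave's actual frame format: the frame bytes are made up, and the CRC parameters (polynomial 0x1021, initial value 0xFFFF) are just the common CRC-16/CCITT-FALSE variant, assumed here for the example. It shows a two-bit corruption that a simple 8-bit XOR checksum misses but a 16-bit CRC catches.

```python
def xor_checksum(data: bytes) -> int:
    """Simple 8-bit XOR checksum (illustrative, not a specific protocol's)."""
    checksum = 0xFF
    for byte in data:
        checksum ^= byte
    return checksum


def crc16_ccitt(data: bytes, init: int = 0xFFFF) -> int:
    """Bitwise CRC-16 with polynomial 0x1021 (CRC-16/CCITT-FALSE variant)."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


# Hypothetical 4-byte payload, made up for this example.
frame = bytes([0x25, 0x01, 0xFF, 0x00])
# Same payload with bit 0 flipped in BOTH of the last two bytes.
corrupted = bytes([0x25, 0x01, 0xFE, 0x01])

# The two flips cancel out under XOR, so the corruption goes unnoticed.
print("XOR checksum matches:", xor_checksum(frame) == xor_checksum(corrupted))
# The CRC values differ, so the corruption is detected.
print("CRC-16 matches:", crc16_ccitt(frame) == crc16_ccitt(corrupted))
```

An undetected corrupted frame is the kind of thing that could, in principle, leave a device and the hub disagreeing about state, which is why I value the extra checking.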