Z-Wave crash rule re-triggered every 50 seconds

I have a rule that notifies me if Z-Wave crashes. It happened last night on my C8-Pro. Unfortunately, the rule was re-triggered every 50 seconds, so by the time I woke up I had received a lot of notifications. I've worked around this using private variables and won't get a notification more than once per hour in the future.

Is this a bug, or does it indicate that the system is caught in a loop of attempting to restart the Z-Wave radio?
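For anyone wanting the same guard, the once-per-hour throttle can be sketched in Python. This is a hedged illustration of the logic, not actual Rule Machine syntax; in Rule Machine the equivalent is a private variable holding the last notification time, compared against the current time in a condition.

```python
NOTIFY_INTERVAL_S = 3600  # at most one notification per hour

def should_notify(last_notified: float, now: float) -> bool:
    """Return True when enough time has passed since the last notification."""
    return (now - last_notified) >= NOTIFY_INTERVAL_S

# Simulated rule state: the "private variable" storing the last notify time.
last_notified = float("-inf")  # never notified yet

def on_zwave_crash(now: float) -> bool:
    """Called on each zwaveCrashed event; notifies at most once per hour."""
    global last_notified
    if should_notify(last_notified, now):
        last_notified = now
        return True   # send the notification
    return False      # suppress the repeat
```

With the crash event re-firing every 50 seconds, only the first event in each hour window would produce a notification.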

Thanks!

Marc

Might help to see a screenshot of the rule, specifically the trigger, but it is always better to show the whole rule. Depending on how you are determining that Z-Wave has crashed, and on the logic in the actions, the rule itself could be in a loop, or the hub could be in a loop trying to restart. (Z-Wave logs might be useful, as well as the hub event logs.)


@tray_e

Thanks for responding!

I have already updated the rule to prevent the looping. The screenshot below shows the rule, but with the new statements that prevent looping highlighted in yellow.

I've also attached an extract from the hub logs that shows the crash rule being triggered every 50 seconds.

My experience is that z-wave logs can only be captured if the z-wave log window is open at the time the events occur. Let me know if there is a way to go back and access z-wave logs from "past" events.

I am using the legacy z-wave stack, as the ZwaveJS stack is laggy on my system.

Hubitat C8-pro running platform version 2.4.3.158

Marc

What is device2123?

The crash seems to occur 42 seconds after the Telnet message. Every time.


Device 2225 also at this point.

They are both instances of a device that sends an email message every time a z-wave crash occurs. It's the first action in the rule ("Notify Marc Phone..."). The device driver is called "LGK Sendmail V3". I doubt that is the cause since it does not involve any z-wave activity. It is talking to a local email relay over the LAN.

Marc

Interesting that they arrive 42 seconds BEFORE the event notification for the rule.

I do not think that the email driver's Telnet socket closing is the cause of the problem. Below, the oldest event is the first entry in the bullet list:

  • At 06:28:56 AM the "z-wave crashed" event arrives, and the email is sent (2 log entries).
  • At 06:29:04 AM the Telnet socket stream is closed for the email message that was sent.
  • At 06:29:46 AM the next "z-wave crashed" event arrives.
  • From there, the pattern repeats.

The patterns I see are:

  • The trigger events ("Z-wave crashed") are all about 50 seconds apart.
  • The timeout of the Telnet input stream for talking to the email relay happens about 8 seconds after the email device is invoked.

A bit more important background info:

  • I use that driver to send emails from other rules and it does not result in a z-wave crash.
  • I had z-wave crashes about every 7-14 days before I added the rule to notify me about them. At that time the email driver was not in use, but the crashes kept coming until I shut down and power-cycled the hub.

My conclusion is that the socket timeout is not the cause of the z-wave crash.

Marc

It seems your real issue is Z-Wave crashing, not how many times the rule reports the crash. The repeated notifications are a separate issue in themselves, but they wouldn't exist if Z-Wave weren't crashing in the first place.

There are lots of posts here about getting to the bottom of the crashes, but it is usually a device in your mesh that is causing them. I would focus on finding the source of the crashes instead of tweaking the rule that reports them.

We can't see the logs from before the crash, but they probably wouldn't show much. You really need to monitor your Z-Wave logs to see if a device is spamming the network, or possibly freezing up and causing repeated issues until the mesh crashes. It is hard to say what the cause is until you start looking. Again, look through existing posts about finding the cause of Z-Wave crashes.

I had a similar issue, and it was a device that just stopped repeating at times, although it apparently still looked like a valid repeating path to the other devices. Whatever the mechanism behind the crash, replacing the problem device fixed my crash issues. Mine was actually two devices, both Eva Logix in-wall dimmers I had installed at the same time; strangely, both seemed to cause the problem. It started with the switches going unresponsive without crashing the network, and I could get them working again with an air gap. Then the crashes started, so I went straight to those known-problem switches I had been resetting every couple of weeks, and it turned out they were the cause.

Edit: I failed to mention that when I was looking at Z-Wave logs for those two devices, they went totally silent before the crashes, which confirmed to me that they were the cause. I assume they stopped repeating while otherwise appearing to be online and just fine.


Agreed, and I am trying to figure out the root cause of the crashes. My reason for creating this thread was to bring attention to the rapid-fire repeating of the z-wave crash events.

Unfortunately, the lack of z-wave log retention makes diagnosing the problem difficult. I have an always-on Raspberry Pi (RPI) -- do you know if there is a way to have the RPI continuously save the z-wave logs?

Marc

There is an endpoint listed for Zwave logs in this list, but http://IP/hub/zwaveLogs in a browser just brings you to the web interface page.

You should be able to leave a log page open in a browser until a crash, or do the logs all go away when your Z-Wave crashes? As long as you don't reboot, the logs should still be there; they might even survive a reboot if you have tabs open for each device log.

I would look for strange activity generally from any device, like logs that just always scroll activity constantly and don't pause, even while things are working, to see if something is spamming the mesh, before the mesh gets overwhelmed.

I know that I saw no activity from my problem dimmers, where everything else had logs. I can't remember now if I noticed that before a crash or after, because I was leaving logging on in a browser to try and catch something at the time.

Update: The z-wave logs are not saved. Once you close the window, they are gone. When you open the window, you will only see things in the log when something new happens. There isn't anything equivalent to the "Past logs" tab that you get on the system logs.

I just remembered that I do have an always-on windows machine. I'm going to open a browser window and leave it on the z-wave logs page...

Marc


Yes - there is a websocket endpoint and you can write a bash or Python script to suck everything down into a file. I have one I could post later if interested; however, that's great grunt work for your favourite LLM. ws://your.hub.ip/zwaveLogsocket
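A minimal Python sketch of that idea, for reference. Assumptions are mine: the hub IP is a placeholder, the third-party `websockets` package is used for the client, and the endpoint is assumed to stream plain-text log frames like the ones posted later in this thread.

```python
import asyncio
from datetime import datetime

HUB_IP = "192.168.1.10"   # placeholder; replace with your hub's address
LOG_FILE = "zwave.log"

def stamp(line: str) -> str:
    """Prefix a received log line with a local capture timestamp."""
    return f"{datetime.now().isoformat(timespec='milliseconds')} {line.rstrip()}"

async def capture() -> None:
    # Third-party dependency: pip install websockets
    import websockets
    async with websockets.connect(f"ws://{HUB_IP}/zwaveLogsocket") as ws:
        with open(LOG_FILE, "a") as out:
            async for message in ws:       # each frame is one log line (assumed)
                out.write(stamp(str(message)) + "\n")
                out.flush()                # flush so a hub crash doesn't lose the tail

# To start capturing (runs until interrupted):
#   asyncio.run(capture())
```

Running this on an always-on box like the RPi mentioned above would give you persistent Z-Wave logs without leaving a browser window open.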

I don't remember seeing this before - I get an occasional zwaveCrashed event after cloud backups on my "old" C7, but I never saw rapid sequences like that, implying the Z-Wave stack recovers quickly and then crashes again? What does your Location Events tab in the Logs section look like?

@bcopeland

If the hub code loses contact with Zipgateway or ZWave-JS, the backend software gets restarted.. It could be a bug in Zipgateway or the chip itself locking up..

Laggy? .. Did you give time for all the re-interviews to complete?

ZWave-JS has much better diagnostics.. And I find it tends to be more stable than Zipgateway..


Yes, I gave it 48 hours. After another 2-3 days some of the switches would experience a 2-3 second lag, so I switched back and everything was nearly instantaneous again.

This z-wave crash is actually not anything new -- it happens about once every 10-20 days. I set up a rule to reboot the hub once/week and the problem went away. I recently disabled the weekly reboot and it looks like the problem is still there.

Any suggestions?

Marc


I don't have much experience with Z-Wave device/mesh issues, so I'll let others chime in. However, you might want to double-check your power supply.

Good suggestion. I've swapped to another Hubitat power supply I have that has the same specs.

I've also switched to ZWave-JS and have a browser window open capturing z-wave logs.

@bcopeland does keeping that browser window open on the zwave logs create any meaningful extra load on the hub that could cause problems?

Marc

@bcopeland, I just ran a simple test to demonstrate the lag I am seeing while using ZwaveJS, captured the z-wave logs, and had ChatGPT analyze them. I ran this test 18 hours after making the switch to ZwaveJS. Here is the scenario:

  • I push the top paddle of my ZEN30 (node 41).
  • This triggers a small rule that turns on 2 light switches (nodes 41 and 97).

An analysis of the z-wave logs points to a ~580 ms gap between the hub receiving the top-paddle button push and the hub sending the ON command to the lights on node 97 (~643 ms for node 41); end-to-end, the lights confirmed ON after about 631 ms (node 97) and 710 ms (node 41). I believe this is why I am experiencing lag times when using ZwaveJS.

There is virtually no perceptible lag when using the legacy z-wave stack. My hub is a C-8 pro running platform version 2.4.3.158.

Can you help me figure out what is causing the lag to increase on ZwaveJS?

Note: I ran the test again 36 hours after making the switch to ZwaveJS and the lag remains.

Thanks for any help you can provide

Marc

Here is the ChatGPT analysis of the z-wave logs:

  • Event timestamps (from the log):
    • Button (Node 041) CentralScene received by hub: 2025-12-04 01:59:19.094 PM
    • Hub → Node 097 send: 01:59:19.674 PM
    • Node 097 confirms ON: 01:59:19.725 PM
    • Hub → Node 041 send: 01:59:19.737 PM
    • Node 041 confirms ON: 01:59:19.804 PM
  • End-to-end latencies (button press → device confirmation):
    • Node 097: 631 ms (01:59:19.725 − 01:59:19.094)
    • Node 041: 710 ms (01:59:19.804 − 01:59:19.094)
  • Where the time is spent (breakdown):
    • Hub processing / queuing after receiving button: ~580 ms → ~643 ms
      • Time between hub receiving CentralScene (19.094) and hub issuing the SendDataBridge:
      • to Node 097 = 19.674 − 19.094 = 580 ms
      • to Node 041 = 19.737 − 19.094 = 643 ms
    • This is the largest contributor to the total latency.
    • Z-Wave transmit + device processing: ~51–67 ms
      • Node 097 RTT after hub send = 19.725 − 19.674 = 51 ms
      • Node 041 RTT after hub send = 19.804 − 19.737 = 67 ms
  • Hub serialized sends gap: hub sent to Node 097 at 19.674 and then to Node 041 at 19.737 → 63 ms gap (sequential sends, not simultaneous).
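The arithmetic above can be checked directly from the log timestamps. This is just a sanity check of the deltas in the analysis; the timestamps are copied from the raw logs below.

```python
from datetime import datetime

def t(s: str) -> datetime:
    """Parse a log timestamp like '01:59:19.094'."""
    return datetime.strptime(s, "%H:%M:%S.%f")

def delta_ms(start: str, end: str) -> int:
    """Milliseconds elapsed between two timestamps."""
    return round((t(end) - t(start)).total_seconds() * 1000)

button   = "01:59:19.094"  # CentralScene from node 041 reaches the hub
send_097 = "01:59:19.674"  # hub issues SendDataBridge to node 097
ack_097  = "01:59:19.725"  # node 097 confirms ON
send_041 = "01:59:19.737"  # hub issues SendDataBridge to node 041
ack_041  = "01:59:19.804"  # node 041 confirms ON

print(delta_ms(button, send_097))   # hub processing for node 097: 580
print(delta_ms(button, send_041))   # hub processing for node 041: 643
print(delta_ms(send_097, ack_097))  # RF round trip, node 097: 51
print(delta_ms(send_041, ack_041))  # RF round trip, node 041: 67
print(delta_ms(button, ack_097))    # end-to-end, node 097: 631
print(delta_ms(button, ack_041))    # end-to-end, node 041: 710
```

The numbers confirm that the RF round trips are fast (51-67 ms); nearly all of the latency is spent between the hub receiving the button event and issuing the sends.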

Here is the rule that is being triggered:

Here are the raw z-wave logs:

2025-12-04 01:59:37.531 PM|DRIVER « [Node 098] [REQ] [BridgeApplicationCommand] - │ RSSI: -68 dBm - └─[MultiCommandCCCommandEncapsulation] - └─[MultilevelSensorCCReport] - sensor type: Illuminance - scale: Percentage value - value: 42
2025-12-04 01:59:37.529 PM|CNTRLR [Node 098] [~] [Multilevel Sensor] Illuminance: 9 => 42 [Endpoint 0]
2025-12-04 01:59:37.528 PM|CNTRLR [Node 098] [Multilevel Sensor] Illuminance: metadata updated [Endpoint 0]
2025-12-04 01:59:37.526 PM|SERIAL » [ACK] (0x06)
2025-12-04 01:59:37.521 PM|SERIAL « 0x011700a80000010062098f010105310503012a00bc007f7f00 (25 bytes)
2025-12-04 01:59:36.174 PM|DRIVER all queues idle
2025-12-04 01:59:36.172 PM|DRIVER « [RES] [GetBackgroundRSSI] - channel 0: -106 dBm - channel 1: -104 dBm - channel 2: -104 dBm - channel 3: -100 dBm
2025-12-04 01:59:36.170 PM|SERIAL » [ACK] (0x06)
2025-12-04 01:59:36.167 PM|SERIAL « 0x0107013b9698989cc8 (9 bytes)
2025-12-04 01:59:36.165 PM|SERIAL « [ACK] (0x06)
2025-12-04 01:59:36.163 PM|SERIAL » 0x0103003bc7 (5 bytes)
2025-12-04 01:59:36.161 PM|DRIVER one or more queues busy
2025-12-04 01:59:20.846 PM|DRIVER « [Node 041] [REQ] [BridgeApplicationCommand] - │ RSSI: -81 dBm - └─[MultiChannelCCCommandEncapsulation] - │ source: 1 - │ destination: 0 - └─[BinarySwitchCCReport] - current value: true
2025-12-04 01:59:20.844 PM|SERIAL » [ACK] (0x06)
2025-12-04 01:59:20.840 PM|SERIAL « 0x011500a8000001002907600d01002503ff00af007f7f77 (23 bytes)
2025-12-04 01:59:20.834 PM|DRIVER all queues idle
2025-12-04 01:59:20.832 PM|DRIVER « [Node 041] [REQ] [BridgeApplicationCommand] - │ RSSI: -81 dBm - └─[MultiChannelCCCommandEncapsulation] - │ source: 1 - │ destination: 0 - └─[BinarySwitchCCReport] - current value: true
2025-12-04 01:59:20.829 PM|CNTRLR [Node 041] [~] [Binary Switch] currentValue: true => true [Endpoint 1]
2025-12-04 01:59:20.826 PM|SERIAL « 0x011500a8000001002907600d01002503ff00af007f7f77 (23 bytes)
2025-12-04 01:59:20.819 PM|SERIAL » [ACK] (0x06)
2025-12-04 01:59:20.816 PM|SERIAL « 0x011d00a97100000100af7f7f7f7f00000300000000030100007f7f7f7f7fea (31 bytes)
2025-12-04 01:59:20.801 PM|DRIVER « [RES] [SendDataBridge] - was sent: true
2025-12-04 01:59:20.799 PM|SERIAL « 0x010401a90152 (6 bytes)
2025-12-04 01:59:20.796 PM|SERIAL « [ACK] (0x06)
2025-12-04 01:59:20.791 PM|DRIVER » [Node 041] [REQ] [SendDataBridge] - │ source node id: 1 - │ transmit options: 0x25 - │ callback id: 113 - └─[MultiChannelCCCommandEncapsulation] - │ source: 0 - │ destination: 1 - └─[BinarySwitchCCGet]
2025-12-04 01:59:20.789 PM|CNTRLR [Node 097] [~] [Binary Switch] currentValue: true => true [Endpoint 1]
2025-12-04 01:59:20.785 PM|SERIAL » [ACK] (0x06)
2025-12-04 01:59:20.782 PM|SERIAL « 0x011500a8000001006107600d01002503ff00c0007f7f50 (23 bytes)
2025-12-04 01:59:20.774 PM|SERIAL » [ACK] (0x06)
2025-12-04 01:59:20.770 PM|SERIAL « 0x011d00a97000000100bf7f7f7f7f00000300000000030100007f7f7f7f7ffb (31 bytes)
2025-12-04 01:59:20.757 PM|SERIAL « 0x010401a90152 (6 bytes)
2025-12-04 01:59:20.753 PM|SERIAL « [ACK] (0x06)
2025-12-04 01:59:20.749 PM|SERIAL » 0x011400a90001006106600d000125022500000000703a (22 bytes)
2025-12-04 01:59:20.745 PM|DRIVER one or more queues busy
2025-12-04 01:59:20.720 PM|DRIVER « [Node 097] [REQ] [BridgeApplicationCommand] - │ RSSI: -65 dBm - └─[MultiChannelCCCommandEncapsulation] - │ source: 1 - │ destination: 0 - └─[BinarySwitchCCReport] - current value: true
2025-12-04 01:59:20.717 PM|CNTRLR [Node 097] [~] [Binary Switch] currentValue: true => true [Endpoint 1]
2025-12-04 01:59:20.714 PM|SERIAL « 0x011500a8000001006107600d01002503ff00bf007f7f2f (23 bytes)
2025-12-04 01:59:20.646 PM|DRIVER all queues idle
2025-12-04 01:59:20.645 PM|CNTRLR [Node 041] [~] [Multilevel Switch] targetValue: 0 => 99 [Endpoint 0]
2025-12-04 01:59:20.642 PM|CNTRLR [Node 041] [~] [Multilevel Switch] duration: {"value":0,"unit":"seconds"} => {"value":0,"unit":"seconds"} [Endpoint 0]
2025-12-04 01:59:20.641 PM|SERIAL » [ACK] (0x06)
2025-12-04 01:59:20.637 PM|SERIAL « 0x011300a8000001002905260363630000af007f7fe3 (21 bytes)
2025-12-04 01:59:20.630 PM|SERIAL » [ACK] (0x06)
2025-12-04 01:59:20.627 PM|SERIAL « 0x011d00a96f00000100af7f7f7f7f00000300000000030100007f7f7f7f7ff4 (31 bytes)
2025-12-04 01:59:20.615 PM|DRIVER « [RES] [SendDataBridge] - was sent: true
2025-12-04 01:59:20.611 PM|SERIAL « 0x010401a90152 (6 bytes)
2025-12-04 01:59:20.609 PM|SERIAL « [ACK] (0x06)
2025-12-04 01:59:20.605 PM|SERIAL » 0x011000a90001002902260225000000006f02 (18 bytes)
2025-12-04 01:59:20.597 PM|DRIVER « [Node 097] [REQ] [BridgeApplicationCommand] - │ RSSI: -64 dBm - └─[MultilevelSwitchCCReport] - current value: 99 - target value: 99 - duration: 0s
2025-12-04 01:59:20.594 PM|CNTRLR [Node 097] [~] [Multilevel Switch] targetValue: 99 => 99 [Endpoint 0]
2025-12-04 01:59:20.592 PM|CNTRLR [Node 097] [~] [Multilevel Switch] duration: {"value":0,"unit":"seconds"} => {"value":0,"unit":"seconds"} [Endpoint 0]
2025-12-04 01:59:20.587 PM|SERIAL » [ACK] (0x06)
2025-12-04 01:59:20.583 PM|SERIAL « 0x011300a8000001006105260363630000c0007f7fc4 (21 bytes)
2025-12-04 01:59:20.578 PM|DRIVER « [REQ] [SendDataBridge] - callback id: 110 - transmit status: OK, took 10 ms - routing attempts: 1 - protocol & route speed: Z-Wave, 100 kbit/s - routing scheme: LWR - ACK RSSI: -65 dBm - ACK channel no.: 0 - TX channel no.: 0
2025-12-04 01:59:20.574 PM|SERIAL » [ACK] (0x06)
2025-12-04 01:59:20.571 PM|SERIAL « 0x011d00a96e00000100bf7f7f7f7f00000300000000030100007f7f7f7f7fe5 (31 bytes)
2025-12-04 01:59:20.563 PM|DRIVER « [RES] [SendDataBridge] - was sent: true
2025-12-04 01:59:20.558 PM|SERIAL » [ACK] (0x06)
2025-12-04 01:59:20.556 PM|SERIAL « 0x010401a90152 (6 bytes)
2025-12-04 01:59:20.552 PM|SERIAL « [ACK] (0x06)
2025-12-04 01:59:20.548 PM|SERIAL » 0x011000a90001006102260225000000006e4b (18 bytes)
2025-12-04 01:59:20.545 PM|DRIVER » [Node 097] [REQ] [SendDataBridge] - │ source node id: 1 - │ transmit options: 0x25 - │ callback id: 110 - └─[MultilevelSwitchCCGet]
2025-12-04 01:59:20.544 PM|DRIVER one or more queues busy
2025-12-04 01:59:19.804 PM|CNTRLR [Node 041] [~] [Binary Switch] currentValue: false => true [Endpoint 1]
2025-12-04 01:59:19.801 PM|DRIVER « [Node 041] [REQ] [BridgeApplicationCommand] - │ RSSI: -81 dBm - └─[MultiChannelCCCommandEncapsulation] - │ source: 1 - │ destination: 0 - └─[SupervisionCCReport] - session id: 33 - more updates follow: false - status: Success - duration: 0s
2025-12-04 01:59:19.798 PM|SERIAL » [ACK] (0x06)
2025-12-04 01:59:19.796 PM|SERIAL « 0x011700a8000001002909600d01006c0221ff0000af007f7f12 (25 bytes)
2025-12-04 01:59:19.791 PM|DRIVER « [REQ] [SendDataBridge] - callback id: 109 - transmit status: OK, took 10 ms - routing attempts: 1 - protocol & route speed: Z-Wave, 100 kbit/s - routing scheme: LWR - ACK RSSI: -79 dBm - ACK channel no.: 0 - TX channel no.: 0
2025-12-04 01:59:19.785 PM|SERIAL » [ACK] (0x06)
2025-12-04 01:59:19.781 PM|SERIAL « 0x011d00a96d00000100b17f7f7f7f00000300000000030100007f7f7f7f7fe8 (31 bytes)
2025-12-04 01:59:19.778 PM|DRIVER « [RES] [SendDataBridge] - was sent: true
2025-12-04 01:59:19.775 PM|SERIAL « 0x010401a90152 (6 bytes)
2025-12-04 01:59:19.772 PM|SERIAL « [ACK] (0x06)
2025-12-04 01:59:19.737 PM|SERIAL » 0x011a00a9000100290c600d00016c01a1042501ffff25000000006da0 (28 bytes)
2025-12-04 01:59:19.734 PM|DRIVER » [Node 041] [REQ] [SendDataBridge] - │ source node id: 1 - │ transmit options: 0x25 - │ callback id: 109 - └─[MultiChannelCCCommandEncapsulation] - │ source: 0 - │ destination: 1 - └─[SupervisionCCGet] - │ session id: 33 - │ request updates: true - └─[BinarySwitchCCSet] - target value: true
2025-12-04 01:59:19.731 PM|CNTRLR [Node 097] [~] [Binary Switch] currentValue: false => true [Endpoint 1]
2025-12-04 01:59:19.725 PM|DRIVER « [Node 097] [REQ] [BridgeApplicationCommand] - │ RSSI: -67 dBm - └─[MultiChannelCCCommandEncapsulation] - │ source: 1 - │ destination: 0 - └─[SupervisionCCReport] - session id: 48 - more updates follow: false - status: Success - duration: 0s
2025-12-04 01:59:19.721 PM|SERIAL » [ACK] (0x06)
2025-12-04 01:59:19.719 PM|SERIAL « 0x011700a8000001006109600d01006c0230ff0000bd007f7f59 (25 bytes)
2025-12-04 01:59:19.703 PM|SERIAL « 0x011d00a96c00000100bd7f7f7f7f00000300000000030100007f7f7f7f7fe5 (31 bytes)
2025-12-04 01:59:19.687 PM|DRIVER « [RES] [SendDataBridge] - was sent: true
2025-12-04 01:59:19.685 PM|SERIAL » [ACK] (0x06)
2025-12-04 01:59:19.682 PM|SERIAL « 0x010401a90152 (6 bytes)
2025-12-04 01:59:19.677 PM|SERIAL » 0x011a00a9000100610c600d00016c01b0042501ffff25000000006cf8 (28 bytes)
2025-12-04 01:59:19.674 PM|DRIVER » [Node 097] [REQ] [SendDataBridge] - │ source node id: 1 - │ transmit options: 0x25 - │ callback id: 108 - └─[MultiChannelCCCommandEncapsulation] - │ source: 0 - │ destination: 1 - └─[SupervisionCCGet] - │ session id: 48 - │ request updates: true - └─[BinarySwitchCCSet] - target value: true
2025-12-04 01:59:19.671 PM|DRIVER one or more queues busy
2025-12-04 01:59:19.094 PM|DRIVER « [Node 041] [REQ] [BridgeApplicationCommand] - │ RSSI: -81 dBm - └─[CentralSceneCCNotification] - sequence number: 47 - key attribute: KeyPressed - scene number: 1
2025-12-04 01:59:19.089 PM|SERIAL » [ACK] (0x06)
2025-12-04 01:59:19.086 PM|SERIAL « 0x011300a80000010029055b032f800100af007f7f30 (21 bytes)