C8 on 2.3.9.193, connected via ethernet, DHCP reservation on router. Light was green.
This morning my C8 became unreachable. Not available in the app, or directly via web, or even via the diagnostic tool. I pulled the power, waited a minute, and repowered. The hub booted, came back up, and was once more available. I got a couple of "hub load" messages and rebooted. Since then it seems to have settled down.
There's nothing in the logs that shows what might have happened. There's just a gap of about an hour while the hub was down.
I would try going to Network Setup and, under the Bonjour section, turning off the automatic restarts. There is a theory this may help with the disconnects. Also, if the hub is up for a long time, these mDNS restarts start to build up a large cache (or something), and the number of packets the hub sends out at each 20-minute restart keeps increasing. So it may eventually reach a point where it crashes the ethernet port? Just a theory anyway, but worth a try.
It was last rebooted 5 days ago. At the very least it gets rebooted on the 1st of every month, and usually more often.
I turned off the automatic Bonjour restarts. Thanks for the suggestion. Crossing my fingers it helps. So far my HomeKit integration has been really solid, so I hope this doesn't change that.
At around 5 days I had found the traffic coming from the hub every 20 mins was enough to lag RDP sessions and also lag my audio on web meetings. This was without the HK integration installed. The traffic was amplified a lot if HK was running on the hub and grows exponentially for some reason. Maybe not enough to lock up the hub in only 5 days but could definitely cause some network disruptions. We are talking thousands of packets being sent from the hub in 1 second, every 20 minutes. Turning that restart off makes it stop.
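If anyone wants to check their own network for the same pattern, here is a rough sketch of how the bursts could be picked out of a packet capture. It assumes you have already exported the capture timestamps of packets sent by the hub as plain seconds; the function names, the 1000-packet threshold, and the sample data are all made up for illustration and are not taken from any hub tooling.

```python
from collections import Counter

def find_bursts(packet_times, min_packets=1000):
    """Bucket packet timestamps (in seconds) into whole seconds and return
    the (second, count) pairs whose count reaches min_packets."""
    per_second = Counter(int(t) for t in packet_times)
    return sorted((sec, n) for sec, n in per_second.items() if n >= min_packets)

def burst_spacing(bursts):
    """Seconds between consecutive bursts."""
    seconds = [sec for sec, _ in bursts]
    return [later - earlier for earlier, later in zip(seconds, seconds[1:])]

# Synthetic data, NOT a real capture: one background packet every 5 s,
# plus 1500 packets squeezed into a single second every 1200 s (20 min).
background = [float(s) for s in range(0, 3600, 5)]
spikes = [start + i / 2000 for start in (0, 1200, 2400) for i in range(1500)]

bursts = find_bursts(background + spikes)
print(bursts)                 # seconds that carried a burst, with packet counts
print(burst_spacing(bursts))  # a steady 1200 would match the 20-minute restarts
```

If the spacing comes out close to 1200 seconds and the counts grow the longer the hub stays up, that would be consistent with the restart theory, though it would not prove it.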
Reviving this thread because it happened again last night. I knew something was not right when I got up in the middle of the night and my nightlight didn't turn on. Any help anyone can offer with this is greatly appreciated.
C-8 on 2.3.9.201. Last rebooted at update 11/13/2024 3:44 pm.
Bonjour restarts are off.
Usually when I do an update I do an extra reboot afterwards. I didn't do that this time because I was busy. I don't know if this is significant or not.
Hub "disappeared" at 2:02 AM. The logs show no errors or warnings. Just a 6-hour gap.
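For anyone who would rather find these gaps automatically than scroll through the logs, a minimal sketch follows. It assumes the log timestamps have already been pulled out into Python datetime objects by some other means; the function name, the 30-minute threshold, and the sample times (loosely modeled on the gap described here, with a made-up date) are illustrative only.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, threshold=timedelta(minutes=30)):
    """Return (start, end, duration) for every silence longer than threshold."""
    ordered = sorted(timestamps)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > threshold:
            gaps.append((earlier, later, later - earlier))
    return gaps

# Made-up sample timestamps, not real hub log data.
sample = [
    datetime(2024, 1, 1, 1, 58),
    datetime(2024, 1, 1, 2, 1),
    datetime(2024, 1, 1, 2, 2),
    datetime(2024, 1, 1, 8, 1),
    datetime(2024, 1, 1, 8, 3),
]

for start, end, length in find_gaps(sample):
    print(f"no log entries between {start} and {end} ({length})")
```

Recording the start time of each gap across several occurrences would make it easier to see whether the outages really line up with the backup schedule.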
This morning I was able to use the diagnostic tool to reach the hub and reboot it. It rebooted at 8:01 AM and has been fine since.
The hub is set to back up at 2:01 AM. Local backup nightly, cloud backup every three days. Both backups ran just before the hub stopped responding. I notice that the local backup ran twice. I don't know why or if any of this is significant.
Check your Logs > App/Device stats and look for any excessively large "states". There was a case recently where some devices with a huge state history were suspected of causing the platform to crash / lock up. I think after they dialed it back the lock ups stopped.
Check the status of your devices in the Z wave table. See if the devices indicate they have a route to the hub or if they even show that they are in the table.
@user837 All of the zwave devices have routes. No ghosts. I haven't added any new devices for quite a while.
@jtp10181 There were a couple of things with large-ish states that I'm not using. I've deleted those. I also found some weirdness going on in CoCoHue, and I think that's taken care of. I rebooted and rebuilt the database for good measure. There have been a couple of spikes in hub load since then, and I can't see what's causing them. When I look at the device and app logs nothing stands out. It seems to have settled down now so I'm keeping my fingers crossed.
Thanks for checking. I have had the same issue twice now after automatic backups. In my case zwave entries were completely gone after an automatic backup. Fortunately for me the restore did work. For now, pending the forthcoming fix(es), I have disabled automatic backups.
If I can think of anything else I will let you know.
Sounds like the backups are a part of the cause, though it's not really clear why that is.
But I would turn off the automatic backups for 2-3 weeks, do a manual backup if/when you make any important changes, and see if that has any major impact. If all is well after that time frame, re-enable the local backups and run for another 2-3 weeks.
You're not the only one who seems to have issues with cloud backups, so I would pursue that path to try to isolate and quantify this. The interaction of cloud disconnects and cloud backups can't be good. I totally agree with turning off the Bonjour restarts, given the known mDNS issues.
I think that might be a coincidence and not actually a cause. Both backups ran last night with barely a blip in load. I might try changing the time, though.
One thing I did realize as I was poking around yesterday... when I set the C-8 up I migrated from my C-5. The C-5 ethernet was connected directly to the router. I plugged the C-8 into a splitter, and after I put the C-5 away I never moved the C-8. I'm wondering if something was going on with the splitter. I moved the C-8 to the router, and we'll see if that makes a difference. I can see that this might have caused some of the cloud disconnects. So maybe it got dropped from the network and just never reconnected successfully?
I'll report back if anything happens. But it was 4 weeks between the last occurrence and this one, so it's obviously not something that happens very frequently. If it was it would be a lot easier to troubleshoot. For now I'm keeping my fingers crossed.
All you needed to do was shut down the hub and unplug it for 30 seconds to restart the z-wave radio in this case. Restore was most likely not necessary.
I did try to wait [for about 45 minutes] but maybe I was just impatient. Some devices were still pending and I began to get spooked. Also, the hub mesh was definitely not syncing up either, so I shut the 2nd hub down, thinking I would try to avoid "bad things", whatever those might be, propagating over to that hub.
The cloud restore always seems to go well the few times I have had to use it so once again I was fortunate it did so again.
Learning opportunity for me. How long should I have waited for the zwave local table to rebuild? Mesh to rebuild?
That's not what I said at all. All you needed to do was a SHUT DOWN of the hub: unplug the hub for 30 seconds and plug it back in. That restarts the z-wave radio and should bring everything back if it goes down due to a backup failure.
It's a known issue when the z-wave backup fails, and the hub actually tells you to do this procedure on the backup page after a failure. They have been working on correcting it and getting z-wave to recover on its own, but I think it can still happen occasionally.
Well, the things I tried above apparently didn't help. I came back from walking the dog to a cold, dark house where nothing was working as it should. As before the light on the hub was green, but the hub was completely unreachable.
I'm completely flummoxed and open to any suggestions. This is happening more often and is getting really annoying. I'd open a ticket with support, but there doesn't appear to be a way to do that any more. So here I am.
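One thing that might help narrow this down next time: probing the hub from another machine on the LAN to see which of its services still answers when it goes "green but unreachable". Below is a minimal sketch using plain TCP connects. The port numbers in the usage comment (80 for the web UI, 8081 for the diagnostic tool) and the IP address are assumptions, so substitute whatever applies to your hub; the function names are invented for this example.

```python
import socket

def port_answers(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

def probe(host, ports):
    """Map each labeled port to whether it accepted a connection."""
    return {label: port_answers(host, port) for label, port in ports.items()}

# Example call (hypothetical address, assumed ports):
#   probe("192.168.1.50", {"web UI": 80, "diagnostic tool": 8081})
```

Run on a schedule from an always-on machine, this would at least timestamp when the hub dropped off and show whether the diagnostic tool stayed up while the web UI did not. It only tests whether a port accepts connections, not whether the platform behind it is healthy.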
I only saw one weird thing in the logs that I can't explain. In the logs there's this extraneous line that doesn't have a device label associated with it and seems to be something going on with the Hue Bridge via CoCoHue. (Note the motion sensor in the line right above is not connected to the Hue Bridge. It's direct to the hub.)
There are no other log entries for the Hue Bridge except a group of the usual "issue parsing JSON array" errors that happened yesterday. That's not unusual and doesn't appear to impact anything as far as I can tell. I asked in the CoCoHue thread and that seems to be the consensus.
I make graphs of stats and nothing seems weird there. The spikes happen after the hub reboots. That happens every time the hub reboots, so not just because I pulled the plug for a minute.
Sorry missed that reply.
Do you know about what update it started having issues?
If it was anywhere in 2.3.9 you could roll back to 2.3.8 and try that for a while. The only problems are that any EZ Dashboards you created will get removed, and Z-wave LR won't work in 2.3.8.
I'm not sure I want to roll back that far, and I'm not sure at which update it started happening. It's so sporadic that it's hard to pin down.
I remembered that I actually have a Hub Protect subscription that comes with an extended warranty, so I put in a ticket. Maybe they can spot something in the engineering logs or ?
I know my hub is not the only one having this problem.
I thought of one other thing although it usually only impacts Z-wave. What is your power source for the hub? No matter what it is, I would try switching to a different standalone power block, USB-A, at least 1A output. Power issues can cause the z-wave radio to go bonkers. I was having z-wave only lock up issues a few months ago and swapped my supply out. Problem never came back after that.