C-7 Hub Locks Up All of a Sudden

Hi all and @bobbyD

My hub, a C-7 (2.3.3.140), has been running smoothly since I received it at launch. I have a solid ZW mesh, low CPU usage, low app and device usage, and clean logs. My secondary hub handles all my Wi-Fi/cloud apps (connected with Hub Mesh) and it's running fine too; this hub has all my ZW and Zigbee devices, rules, and dashboards. No changes were made in the last few days, and all of a sudden it has locked up twice in the last 12 hours (green light, no automations working, and it can't even be pinged). A power cycle is all that recovers it. Any ideas on where or what to look for? It's tough when the logs are clean and everything looks normal.

Thanks

I imagine the first recommendation would be to download a backup and restore it, which will clean the database. Even if database corruption wasn't the initial cause of your issue, a power cycle without a controlled shutdown can corrupt the database. The backup/restore will sort that out.

Yes, already on that, thanks.

While I'm at it, after the restore I'm updating to the latest FW.

There is bad weather about, so it could be power fluctuations. Do you have it on a UPS?

Yes sir. Both hubs and all network equip are on a UPS.

Knock on wood, the hub's been stable since the restore and FW upgrade, but it's concerning to me that the hub can just lock up with a green light, with no warning and no reason shown in the logs. Has this happened to anyone else? If so, did you ever find a reason?

Well, sadly, the hub stayed up for about 4 days and crashed again. Any help, @bobbyD, would be appreciated.

Here's a full description.
My hub has been working great for over 2 years, and all of a sudden I've been experiencing hub lockups or crashes: 3 in the last 6 days. The symptom is a green light, but the unit is unresponsive and not even pingable; the only recourse is a power cycle.
I have a 2-hub setup. All my ZW and Zigbee devices are on this hub (my primary hub, with all my rules and dashboards, so it's critical), and all my network and cloud devices are on my secondary hub and shared via Hub Mesh. Both hubs are connected to Node-RED via Maker API.
No major changes have been made recently; the last thing I did was remove 4 Hub Mesh devices and replace them with virtual devices. I was using the Tuya community app to control some RGBW lights via the cloud on my secondary hub but found a local option via Node-RED, so I carefully removed the actual cloud devices and replaced them with virtual devices shared to Node-RED via Maker API. When I do a removal I do it systematically: I remove all links to the device first, then remove the device itself.
This change was done on the 20th and the hub ran fine for days, but on the 23rd it locked up, and after a power cycle it locked up again about 12 hours later. After that power cycle I restored an old db from the 20th (from right after I did the device change) and upgraded to 2.3.4.123, as I had been running the latest 2.3.3 version. After that it ran fine until this morning, when it locked up again.
In each case the logs are clean. App and device stats show CPU usage at <1.0%. I have the hub info driver, and my free memory hasn't dropped below ~274000. The unit is on, and has been on, a UPS.
There have been no changes to my network in that time either. I have a slightly above-average network layout: one 24-port switch connecting 75% of the devices and a UniFi Wi-Fi setup handling the remaining 25%, all on a flat 192.168.1.xxx network. All my devices have reserved IP addresses, so there is no chance of an IP address change, and I have an automated hub reboot performed each week.

As I strive to keep a clean, low stress system I'm at a loss as to what's going on. Is there anything in the engineering logs that you can look at to help or anything else? I'll be happy to send any additional info, please let me know.
Thanks
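Since the hub's own logs show nothing, one way to at least pin down when each lockup starts is to probe the hub from another machine on the LAN. The following is only a rough Python sketch, not anything provided by the hub vendor: the IP address is a made-up placeholder, port 80 is assumed to be where the hub's web interface answers, and the function names are hypothetical.

```python
import socket
import time
from datetime import datetime

HUB_IP = "192.168.1.50"   # hypothetical address; substitute the hub's reserved IP
HUB_PORT = 80             # assumption: the hub's web interface answers on port 80


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_outages(samples):
    """Collapse (timestamp, reachable) samples into (first_failure, recovery) windows.

    recovery is None if the hub was still down at the last sample.
    """
    outages = []
    down_since = None
    for stamp, ok in samples:
        if not ok and down_since is None:
            down_since = stamp          # first failed probe of a new outage
        elif ok and down_since is not None:
            outages.append((down_since, stamp))
            down_since = None
    if down_since is not None:
        outages.append((down_since, None))
    return outages


def watch(host=HUB_IP, port=HUB_PORT, interval=60):
    """Probe forever, printing a line whenever the hub changes state."""
    last = None
    while True:
        ok = is_reachable(host, port)
        if ok != last:
            state = "reachable" if ok else "UNREACHABLE"
            print(f"{datetime.now().isoformat(timespec='seconds')} hub is {state}")
            last = ok
        time.sleep(interval)
```

Calling `watch()` on an always-on machine gives a timestamp for each lockup to within the probe interval, which can then be lined up against UPS events or anything else on the network. `find_outages` does the same grouping after the fact on a saved list of samples.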

Any chance you enabled jumbo frames on your network recently? That's a known issue that can cause symptoms like the ones you report.

No, @brad5, that's what makes this so hard to troubleshoot. No changes of any kind were made by me for a few days before this, and the logs are clear, as are the app stats, device stats, etc. I specifically didn't want to muck with anything because of the holidays. Could there have been an auto-update of a device somewhere? Sure, but none that I initiated.
What's also odd is that all my network devices are on a switch, which would mean only a jumbo packet destined for the hub would reach the hub. But all my network devices, save for a couple of phones and OwnTracks (for presence), are serviced by my secondary hub, which has been rock solid.

Maybe you have a faulty power supply. Try changing it out to see if that resolves the issue. Any 1A USB power brick will work.

Locked up again after about 6 hrs, no warning, logs clean, etc...

Been running on the same supply for 2 years but it's worth a shot, thanks.

Edit: did a controlled shutdown and installed a 2A supply I had, so we'll see.
Edit 2: double-checked the router settings and the MTU is set to 1500, so no jumbo packets should be seen by the hub.
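For anyone wanting to verify an MTU on the wire rather than in a settings page, the usual trick is a don't-fragment ping sized to exactly fill one packet. The arithmetic is sketched below in Python as an illustration only; the function names are made up here, and the calculation assumes IPv4 with no header options.

```python
IPV4_HEADER_BYTES = 20   # IPv4 header without options
ICMP_HEADER_BYTES = 8    # ICMP echo request header


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def is_jumbo(mtu, standard_mtu=1500):
    """True if the MTU exceeds the standard Ethernet value of 1500 bytes."""
    return mtu > standard_mtu
```

With an MTU of 1500 this gives a 1472-byte payload, so on Linux something like `ping -M do -s 1472 <hub-ip>` should succeed while `-s 1473` should fail if the path really is limited to 1500. On Windows the equivalent flags are `-f -l 1472`.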

@wecoyote5 curious if the change in power supply solved your issue?

Thanks for checking in (there is another 'freezing' thread that I've been monitoring and updating, and I forgot about this one). As it stands right now, knock on wood, the hub has been up without crashing since I replaced the power supply. I'm only cautiously optimistic, since a power supply failure (especially an intermittent one) is such a low-probability cause, but fingers crossed, that was it. If it goes a month then I'll feel more comfortable.

Update EDIT: I know I'm jinxing myself but, knock on wood, the hub has been stable since the power supply change. Now, full disclosure, I did remove some Ethernet-based apps during this process, but I still had a lockup after that, so I ruled them out. The last change I made was the power supply and it's been stable since then, 3+ weeks now.
