It's my turn. Hub died today.

Woke up this morning to a dead hub (happened around 2am). The admin console was not responding, but I could get to port 8081. Last week it was the Zigbee network going offline for no reason.

Had to reboot by pulling the plug. On boot it would sometimes stick at 45% and sometimes at 65%. I've left it for over an hour and it just sits there. I've even pulled the USB and rebooted, with no luck.

Tried downgrading the firmware, and it has the same issue.

So it looks like the database is probably corrupted. I have backups taken multiple times a day on my server, so I'm not too worried about that. Will probably have to do a reset and restore if possible.
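For anyone who wants the same kind of off-hub backups, here is a minimal sketch of a script that could be run from cron on a server. The hub address, the `/hub/backupDB?fileName=latest` endpoint, the `.lzf` extension, and the function names are all assumptions for illustration, so check them against your own hub before relying on this.

```python
import time
import urllib.request
from pathlib import Path

# Assumption: hub address and backup endpoint; verify both for your hub.
HUB_BACKUP_URL = "http://192.168.1.50/hub/backupDB?fileName=latest"


def fetch_backup(url=HUB_BACKUP_URL, timeout=120):
    """Download the backup file and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def save_backup(data, dest_dir, keep=20, now=None):
    """Write data to a timestamped file and keep only the newest `keep` files."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # now=None means "use the current time".
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(now))
    path = dest / f"hub-backup-{stamp}.lzf"  # assumed extension
    path.write_bytes(data)
    # Timestamped names sort chronologically, so the oldest come first.
    backups = sorted(dest.glob("hub-backup-*.lzf"))
    excess = max(len(backups) - keep, 0)
    for old in backups[:excess]:
        old.unlink()
    return path

# Example (needs a reachable hub):
#   save_backup(fetch_backup(), "hub-backups")
```

Keeping the download and the save/prune step separate means the pruning logic can be checked without touching the hub.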

Waiting on support to reply, but does anybody have any suggestions?

Curious.... I'm starting to question the quality of the internal memory chip. I wonder if they are having premature failures? This might explain why hubs are just going dead without any (known to us) cause???

Do a soft reset and then restore from backup:

Going to try that now.

Back in business. Didn't take long at all. Bookmarked for future reference.

Now upgrading back to the latest firmware, and I'm back to how it was last night at 10pm.

Thank you.

Mine choked for the first time today as well. I have noooo idea what it was doing, but it took 30 minutes to get to the settings page to perform a 'graceful' reboot. It all started around 5:00 AM EST. No devices were responding. I had a few Zigbee devices drop off overnight as well, including my Pearl, which SUCKED! WAS SCAAAARY!!! It has been OK since then...KNOCK ON WOOD!

I have to admit. At first I was upset when this happened...

but after going through the process of a soft reset and restoring the database, I appreciate the HE platform a heck of a lot more.

It took me 10 mins once I knew what to do, and I was back up with zero data loss.

I consider this a big plus!

At least you were able to boot back up. So the next step if it happens again is to see what app may be hogging the CPU. Sometimes I have my logs on overnight on my computer. That helps me see what may have happened and at what time.

I hear that! The kicker is I don't have any apps running other than the HE built-in ones like RM and such. Nothing new has been added in any way for a bit either.

I've had some corrupted rules in RM that caused issues (recently with Hub Link) - not that serious of course. Usually it's because I change the rule conditions and forget to remove/replace the related rule definition. It can be a little picky...

I'll definitely take a look-see at some of my recent additions. LORD KNOWS I've been getting more and more creative with RM as time goes on.

Can you look at any device groups you may have and see if they have old, incomplete ghost RM rules? I believe I identified this happening to me, but I haven't had issues from it.

Will take screenshots and post a new thread when I get home.

Is there a way to monitor CPU usage or how heavily loaded a hub is?

No. Can't see CPU, memory, I/O loading, or storage used.
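As a rough workaround, one could time requests to the hub's web interface from another machine and treat slow or failed responses as a sign of load. This is only an outside-in proxy, not real CPU or memory data, and the hub address, the 2-second threshold, and the function names below are assumptions for illustration.

```python
import http.client
import time
import urllib.request

HUB_URL = "http://192.168.1.50/"  # assumption: replace with your hub's address


def probe(url=HUB_URL, timeout=10.0):
    """Return seconds taken to fetch url, or None if the request failed."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
    except (OSError, http.client.HTTPException):
        return None
    return time.monotonic() - start


def classify(elapsed, slow_after=2.0):
    """Label one probe result as 'ok', 'slow', or 'down'."""
    if elapsed is None:
        return "down"
    return "slow" if elapsed > slow_after else "ok"


def monitor(interval=60, url=HUB_URL):
    """Print a timestamped status line every `interval` seconds until stopped."""
    while True:
        elapsed = probe(url)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        detail = "no response" if elapsed is None else f"{elapsed:.2f}s"
        print(f"{stamp} {classify(elapsed)} {detail}", flush=True)
        time.sleep(interval)
```

Running `monitor()` overnight and redirecting the output to a file would at least give a timeline of when the hub started to slow down.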

Again, we need better monitoring tools. Just think how many support tickets would be prevented if developers had the ability to properly troubleshoot their own running apps. Common sense if you ask me.
