Hub just crashed again, twice in a week now

Couldn't access any page, so I went to look and found a red LED on the hub. Pulled power to reboot the hub for now. Not sure what's going on, but it has happened twice now in a week. The only browser connected had been pointing to the Zigbee logging page for a while.

Tagging @bobbyD

Update: Rebooting is taking forever, 10 minutes now and it's sitting at only 70%. Must be restoring a db due to corruption?

Update 2: still stuck at 70% after 20 minutes.. hmm

Update 3: OK, 30 mins now and still stuck at 70%... :pray:

Gave up and pulled power again; now the hub is back up and running.

After the hub rebooted, my Z-Wave devices were going crazy. No Z-Wave device would respond, and lights would randomly turn on and off. My stairway lights were turning on and off in rapid succession. It seems commands were being queued up, so when the Z-Wave network started working again, it went crazy emptying the queue. It did this after I disabled Z-Wave and re-enabled it.

I concur with @cuboy29. My hub took a dumpola tonight as well. Nothing was responding, house was hot, hub light was red.
Manual reboot wasn't slow for me.

Add me to this list, it looks like it reboots, but I can’t connect.

Down hard... contacted support.

Hub turns on and shows a solid blue LED immediately. All my Keen vents are blinking "no connection".

I have two HE hubs running. Neither has crashed, locked up, or rebooted over the past 6 months (except when I wrote some bad Groovy code!)

Perhaps there is a common App that you are all using? WebCoRE has been known to cause some issues in the past.

I was crashing once a day...until...I turned off the nest integrations....

I know staff weighing in on this may appear biased, but I'm with Dan on this. I have two hubs, and I run pre-release platform software as well as development apps and drivers on both of them, and I honestly can't remember the last time a hub of mine crashed...

I would suggest that any of you who are having issues open a live logging window and leave it open...
Then go through the logs looking for Java and/or Groovy errors, and identify, fix, and/or eliminate these as a starting point.

Jumping on a thread with "me too" and no data points really doesn't help find the issues.

If it's someone else's code and/or driver, notify the author and/or fix it yourself. If you aren't able to do either of those, then remove the code!

If you see Java and/or Groovy errors being generated by HE built-in apps or drivers, please notify support with the relevant details so we can resolve these.

I can absolutely tell you that running stock apps and drivers will not crash your hub. There are some bugs out there still that may affect performance, sure, but there's no built-in code out there that I am aware of (and I use and/or have tested quite a bit of it) that will crash your hub....

Bad code can crash computers and gank databases, there's nothing new here, this is not a surprise.

Having access and the ability to publish custom code comes with a responsibility as well.

As an additional tip, if you're installing a new app or driver (i.e. one that you have no prior experience with), open a live logging window beforehand.
That is the perfect time to see how it behaves while it's being initialized and set up.

We know there are issues with Nest cameras currently.

Kinda hard to have data points when it goes down hard in the middle of the night and I was not connected with logging at the time...

I contacted support and will go through that.

Sure, understood, but I bet there was plenty going on in the live logs that preceded that event.

That's why I'm offering suggestions on how to head it off before it's too late.

But I have no logs. I do not keep logging open so there is no system drain.
I am looking for a way to unbrick right now.

I also have 2 hubs, one for development, and my “production” hub has been very stable. Since February I have only experienced one lockup, and that was my own fault with bad code. I have many custom apps for alarm integration, managing my hot water circulation pump, etc., but I try to use stock apps and RM where possible. I don’t use WebCoRE, and never did in ST either.

Sure, I have had slowdowns, especially when I had many older first-gen GE Z-Wave devices and my mesh was pretty unstable. I had RM rules in place running automatic refreshes at several periodic intervals. I have since replaced my main switches with Plus models and also put conditions around my refresh rules, so I only execute them if motion is detected in an area or my patio doors open, rather than all the time. These changes have helped tremendously and things are very fast and stable.

As Mike suggests, it’s important to keep live logging open after you install a custom app or driver to see if there are any errors. You don’t need to keep it open indefinitely, just long enough to test all the features of that code and ensure there are no errors. Those errors can lead to memory or CPU issues and cause the hub to crash like any other computer you use on a daily basis.

@bobbyD has access to additional logging that we consumers don’t and can possibly help pinpoint your issues.

Thanks guys, @bobbyD looked and found that a LAN device was flooding the network. I am not sure why this would be the case, since I have had this Sonoff device on the network for a while now. In fact, I have 4 in total on the network. I was also messing around with a Z-Wave driver and left it half completed, so there were some Z-Wave-related errors.

For now, I'll just remove all the Sonoff stuff since I have plenty of Zigbee plugs from the recent Lowe's shopping spree.

I keep a Logs tab open pretty much all of the time, and there is no noticeable "system drain" from doing so (large system, 220+ devices, 100+ automations, active development and testing going on, etc.).

Virtually every case we've seen of hub crashes and slowdowns, where we have dived into the logs and explored the causes, has come from bad custom drivers and apps.

This is the sort of thing we see over and over.

If your hub is slowing down, crashing, going red LED, etc., odds are you have some bad custom app or driver bringing it down. Logs are the easiest way to find these problems. There should be NO errors in your logs. If you see red there, investigate the source.
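If you save the log output to a text file, a small script can tally which app or device is producing the errors. This is only a rough sketch in Python: the line layout it parses (`app:<id>` or `dev:<id>`, a date, a time, a level, then the message) is an assumption made for illustration and not the hub's documented log format, and `count_errors` is a hypothetical helper name, so adjust the pattern to whatever your saved logs actually look like.

```python
import re
from collections import Counter

# ASSUMED line layout (hypothetical, not the hub's documented format):
#   "<app|dev>:<id> <date> <time> <level> <message>"
LINE_RE = re.compile(
    r"^(?P<source>(?:app|dev):\d+)\s+\S+\s+\S+\s+(?P<level>\w+)\s+(?P<message>.*)$"
)


def count_errors(lines):
    """Return a Counter mapping each source (e.g. 'dev:42') to its error count."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        # Lines that do not fit the assumed layout are skipped, not guessed at.
        if match and match.group("level").lower() == "error":
            counts[match.group("source")] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = [
        "dev:42 2018-09-01 02:14:07.123 error groovy.lang.MissingMethodException: unknown method",
        "app:7 2018-09-01 02:14:08.001 info Rule triggered",
        "dev:42 2018-09-01 02:14:09.500 error java.lang.NullPointerException",
        "app:7 2018-09-01 02:14:10.250 error Something went wrong",
    ]
    # Print the noisiest sources first.
    for source, n in count_errors(sample).most_common():
        print(source, n)
```

The sources at the top of that list are the first custom apps or drivers to investigate.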

I wish we could see the same logs as Bobby. My hub was working fine, then it stopped working with no changes. Bobby said some nodes were the problem, but I had already removed those nodes. Then there was an app sending a command that the device rejected as "unknown method", for a node that was my siren; the only app the siren had was HSM. I could figure this out if I had those hidden logs available. Still, today I saw a slowdown again in my Z-Wave mesh. Honestly, I wonder whether this Nortek USB stick is the bottleneck here...

I doubt that very much...

99.9% of all driver/application errors, as well as Java and Groovy platform errors generated by app/driver issues, are exposed in live logging along with the offending line number.

Once you run live logging and see no errors, that's the point at which we might have a look at your log files.

I am in the same situation as Dan,
Two hubs - one production, one test, with lots of crappy part drivers and apps.
Never had a lockup on either one!

I think I’ve been really lucky with the test hub :slight_smile:

Andy

I'm sure you doubt that very much because I'm not Bobby, but it's not only this, my friends. I feel like every time we post a problem, it is treated as a user or end-device problem, not a hub problem. I don't like to do this, but I will have to so I can better explain my situation. I hope nobody gets offended or feels like I'm being rude, because I'm not.

I had my whole system installed in ST, and everything worked perfectly. The only issue was that when I saw this company start in February, I realized that cloud dependencies are not good for security. I was using ST and WebCoRE for my home security. I had seen a few cloud problems in the past, so I had already started with Home Assistant and migrated some devices there, and it worked perfectly too, with no cloud dependence.

Here is my point: why do you, I mean Hubitat, blame the GE Z-Wave switches for not updating their status? I never had that issue with ST or Hassio. I removed 4 GE dimmers because Hubitat can't receive the correct status, but those dimmers were working fine in Hassio and in ST. Why do I have to invest more (getting new switches and dimmers) because Hubitat can't fix the simple issue of updating the status? I know Hubitat has more processing power than ST and a Raspberry Pi 3 Model B with Hassio, so why can they do it and Hubitat can't?

I'm very frustrated now. I did not get your hardware in February because I knew this would happen. I patiently waited until July, and as my birthday present I got Hubitat. What a present I got...

I have a small home. I'm a middle-class person, on the edge of being poor, and I can't spend more effort and money on this. I'm just posting my problems to get them resolved so that your future as a company will be better: fewer problems with other customers like me, and more referrals. But just shooting back that it's my fault only makes me wish to uninstall everything.

Thanks

Maybe there should be a mechanism in place to prevent a single device from trashing the Z-Wave network. I am not sure whether that's possible, on Hubitat or otherwise. Another option might be to blacklist a device/IP that is flooding the hub, and to let the user know something is causing the hub to slow down. It would be good for users to get a sense of what's going on with their hub, as opposed to waiting for support to look at some logs that we don't have access to. Speaking of that, what can support do on or access in our hub anyway? Is this some kind of backdoor?
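To make the flood-detection idea concrete, here is a minimal sketch in Python of a sliding-window counter that flags a source once it sends more than a set number of messages within a time window. It is purely illustrative: the `FloodDetector` class, its thresholds, and the example IP address are all hypothetical, and nothing here reflects how the hub actually handles LAN or Z-Wave traffic.

```python
from collections import defaultdict, deque


class FloodDetector:
    """Flag sources that exceed max_messages within window_seconds (hypothetical helper)."""

    def __init__(self, max_messages, window_seconds):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        # Per-source queue of recent message timestamps, oldest first.
        self._recent = defaultdict(deque)

    def record(self, source, timestamp):
        """Record one message and return True if the source is currently flooding."""
        recent = self._recent[source]
        recent.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while recent and timestamp - recent[0] >= self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_messages


if __name__ == "__main__":
    # Made-up values: allow at most 3 messages per 10 seconds per source.
    detector = FloodDetector(max_messages=3, window_seconds=10.0)
    for t in [0.0, 1.0, 2.0, 3.0]:
        if detector.record("192.168.1.50", t):
            print(f"192.168.1.50 looks like it is flooding at t={t}")
```

A hub could use a check like this to warn the user, or to temporarily ignore the offending device, instead of silently slowing down.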

I was thinking the same..

So to keep my system stable and watch it for issues I need to keep my computer on all the time? Thank you, but no...
I run stock except for 2 devices and 3 apps. They have been solid for months. I haven't changed anything since the last few updates, and before that, not since the firmware update before them.
