Another hub lockup

Oh... and a freebie bug report for you, Chuck. Detach the Z-Wave stick from the hub... and the system message that comes up informs me that Zigbee isn't working. Think you got that the wrong way around :wink:

-- Jules

Actually... it's correctly reporting that the Zigbee radio is down, and not saying anything about Z-Wave. That's more than just picking the wrong string - not sure why that would happen. Anyway... probably not important. You can probably safely ignore that bug report - unplugging the Z-Wave stick isn't really something you're supposed to do in normal operation, I suspect :wink:

-- Jules

Not to derail the topic here, but how does support diagnose a problem like this? Are they able to log into hubs remotely? Do they have a black bag of diagnostic tools they can run on hubs? If they have the means to view logs that we cannot, then not giving users any reasonable diagnostic tools puts all the load on them. It's not like seeing diagnostics such as CPU and memory load would have a negative impact.

They usually just look at logs/what is installed.

Per previous comments by Hubitat staff - no, they don't.

For the record, and for others' use:

  • I removed the Chromecast app and deleted all the devices it created
  • Since removing them, I have not experienced the constant slowdowns of the past, and for two days in a row I have had nothing but "Info" items in my historical logs

I am now working through the other custom code that I had previously stopped using or removed, adding it back one item at a time with 2-3 days in between, since that seems to be the longest the hub previously went before a lockup.

For the HE staff: if you would like more information on my Chromecast setup, let me know. I have 2 Minis and a Home Hub.

Hope it helps someone.

Chris
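
For anyone who wants to follow the same one-at-a-time approach, here is a minimal sketch of how the re-enable schedule could be planned. It only works out dates; it does not talk to the hub. The item names and the start date are made up for illustration, and the 3-day soak period is an assumption based on the 2-3 days mentioned above.

```python
from datetime import date, timedelta


def reintroduction_plan(items, start, soak_days=3):
    """Return (item, enable_on, judge_on) tuples, one item per soak window.

    Each item is enabled only after the previous one has run for
    soak_days without a lockup, so a lockup points at a single suspect.
    """
    plan = []
    enable_on = start
    for item in items:
        judge_on = enable_on + timedelta(days=soak_days)
        plan.append((item, enable_on, judge_on))
        # The next item goes in on the day the current one is cleared.
        enable_on = judge_on
    return plan


# Hypothetical list of custom code to add back, in order.
candidates = ["custom app A", "custom driver B", "custom app C"]
for item, enable_on, judge_on in reintroduction_plan(candidates, date(2019, 5, 1)):
    print(f"{enable_on}: enable {item}; if no lockup by {judge_on}, move on")
```

If the hub locks up inside a window, the item enabled at the start of that window is the first thing to disable again before continuing with the rest.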

With the hub update 2.1.0, I decided to re-do all of my Rule Machine automations, but before that happened, my hub locked up last Friday - the first time ever for me.

With no clear changes that might have led to this, I removed a few apps that I was testing out, and even though I have very very little in the way of "fancy" apps loaded anymore, I experienced another complete lock up yesterday.

In both cases, by the time I realized the hub was locked up, a bunch of my ZigBee devices had dropped their connection, so I have to assume the hub's ZigBee network also stopped working altogether.

Before contacting support, I was hoping to try performing a "soft" Hub Reset as explained by @raidflex here:

...however I got a page with a single text entry field asking for a "Login", and my Hubitat username / password are not accepted.

So, I am going to disable the only couple of things that could be suspect and see if the hub locks up again.

Perhaps there's something I'm missing, but in the past I've only experienced slowness, which I resolved by removing the "offending" apps, and I had never had the hub lock up before the most recent update.

Scratch that. Hub locked up this morning, red light, :8081 was not available.

Rebooted, no "Errors" in the historical logs. Opened support call.

Just got this from support:

Hi there, sorry for the troubles. Our engineers have identified the issue and a fix is coming with next update.

So apparently whatever caused my last lockup is some sort of bug.

I look forward to seeing the description of the bug/fix for your specific issue in the release notes of the next release then!

The more we understand what the issues are, the more we can help ourselves - which is good for both users AND Hubitat.

Good news. Always great when something is identified.

As a follow-up, I asked if I could prevent the known issue in some way and got this:

Sorry for the troubles. There is no workaround for this problem. Enabling more custom devices it will only cause your hub to go unresponsive quicker. I would hold off changing anything until the next update rolls out, which shouldn't be far away. However, I do not have a time frame when it will happen.

So it may be something in the handling of custom device drivers, etc. Not clear.

I don't want you guys to go thinking that everybody's hubs will be fixed with the next release. We found a very specific bug that was reproducible. If you click on an event in the event list page, it will attempt to open an event details page. Currently that page will blow up: the hub gets into a loop while loading the page, which causes an out-of-memory error. This is the fix support is referring to for the next release.

The other issue that people have been experiencing with hub lockups is still our highest priority. I have been working with multiple people to attempt to get to the bottom of it. While we have many promising leads we do not yet have a fix.

An update on my particular issue with extreme slowing. With Chromecast Integration AND Nest Integration disabled, the issue was gone. Two things at once, not a good troubleshooting method.

So after re-enabling the Nest Integration, I had very slow hub performance within a few hours. I disabled the Nest Integration and rebooted. Things were fine for a few days. I re-enabled the Chromecast Integration and left the Nest Integration disabled. After a few more days, everything was still fine.

Nest Integration is deprecated anyway, and I'm only using it for turning on lights in an emergency. So goodbye Nest Integration, and I've switched back to IFTTT for that. All is well on my hub.

Had the same thing, and it also seemed to fix mine, but I then used the port of NST Manager and have not had a lockup since. So if you still want an integration, that's the way I would do it :slight_smile:

Meh, it's cloud anyway, no matter what integration you use. I figure this is another stopgap until Google lets everyone in on their big plans for the new integration method.

Should my "grandfathered in" Nest integration ever fail, I will probably sell my Nest and replace it with a basic Z-Wave t-stat.

Bummer, I don't have the Nest integration. I was really looking forward to the fix, but based on the description that's not what I'm experiencing.

Yes, that is why I phrased it "my particular issue", so as not to imply that I've discovered something that is global. In fact, I'm going to guess (and probably won't be too far off base) that most people are dealing with something similar: either a corrupt database for some reason (possibly from pulling the power without a proper shutdown) or custom code that conflicts with some other code or with the platform itself.

I would lean in that direction as well. If it were a hardware issue, you would think that statistically at least one of my 3 hubs would have experienced what others have reported, and none has. But I have also never pulled the power cord on any of the hubs without a proper shutdown first (and I also perform periodic reboots, which are not recommended).

Thank you for posting the details, Chuck. It makes it easier to understand and set expectations for what may or may not be fixed in the next release. Because I'm still doing so much work within my hub, I suspect I have hit it several times lately by accident.

The last few days I've been too busy to muck about, things just ran on their own with the automations in place and all is well so far. :slight_smile: