Trying to Debug Hub Lockups

@Ryan780 - aside from webCoRE, what other custom apps are you running? Any custom drivers? I am trying to track down a similar problem and would like to see if we have anything in common.

Well, I have 6 Hubduino boards, and every time I get a lockup I still have to reset all of them by power-cycling them. Not many custom drivers, and none of the devices that use them were doing anything at the time (custom LED light strip notifications, stuff like that). I'm also running Device Monitor, Cast Web, Ecobee Suite, Event Ghost, MyQ lite, WebOIPi (for one relay). That's it.

The oddest thing is that this last one happened when literally nothing was going on in the house. I had been away for several days at that point, so no motion activity or whatnot. No scheduled activity had taken place for almost 11 hours prior (the away lamp turned off the night before), and nothing was scheduled for another 10 hours. It really couldn't have been more quiet, which is so strange. If a lot had been going on, I could understand it. The only thing it might be is the ecobee. As anyone who has one knows, the system has been on the fritz for several weeks. If that API was inaccessible, the integration might have gotten itself into a loop trying to poll the server.
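To illustrate the polling-loop worry: a client that retries a dead API immediately can spin, while one that backs off exponentially quickly settles into infrequent retries. This is a minimal sketch only; `backoff_delays` and its parameters are made up for illustration, and nothing here reflects how Ecobee Suite or the hub actually polls.

```python
import random


def backoff_delays(base_seconds, cap_seconds, attempts, jitter=0.0, rng=None):
    """Return the wait before each retry: doubling from base_seconds, capped.

    Hypothetical helper for illustration; not part of any Hubitat or ecobee API.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        # Double the wait on every failed attempt, but never exceed the cap.
        delay = min(cap_seconds, base_seconds * (2 ** attempt))
        # Optional jitter (a fraction of the delay) spreads retries out.
        delay += rng.uniform(0, jitter * delay)
        delays.append(delay)
    return delays


# Eight consecutive failures, starting at 30 s and capped at 30 minutes.
delays = backoff_delays(base_seconds=30, cap_seconds=1800, attempts=8)
print(delays)
```

With no jitter, the waits grow from 30 seconds up to the 30-minute cap, so an unreachable server costs only a handful of requests per hour instead of a tight loop.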

I'm past the webCoRE argument at this point. Just trying to offer suggestions. We don't have tools, so you have to work with what you have. webCoRE is the only app you mentioned, and I don't have access to view anything on your hub. You could have InfluxDB, which was also flagged at some point. Now that I know you only have 4 devices and 4 pistons in webCoRE, ya, it might not be that. You can also view all the webCoRE subscriptions and schedules by clicking on the cog icons in your app list. This sometimes helps to spot apps that do a lot of stuff.

I also didn't know you were away for days when this happened, so any log tricks won't help. The few times this has happened to me, I was able to get to the past logs and get an idea before they were overwritten. I minimize all logging of non-essential things so that I can see past errors if anything comes up.

Maybe others will have good suggestions. Good luck.

I've been having this exact problem lately. Last week support pointed out that I was using the HEM in secure mode and that my custom driver might be the issue. I switched it to insecure and to the stock DH, but the hub locked up again last night. Waiting for the next move from support now.

Sorry, you lost me here. Hem?

I don't have any secure Z-wave devices (no locks and hub set to secure only locks/doors).

Do your 'random' lockups tend to be noticed when you wake up in the morning? I have had a few and noticed a trend: we always identify the issue in the morning. My hypothesis is that the hub has degraded over the better part of a week of run-time, and then the middle-of-the-night cleanup/maintenance routines might push it over the edge. Just a theory...

I don't have too many custom apps in common with you, except possibly Device Monitor. Are you running this one?

If so, it might be worth removing for both of us to see if it is the root cause. I know some newer apps have been released recently that may do a better job.

Thanks!

No, I'm running a port of Device Monitor from ST that I can't even find the forum post for anymore. It has been rock solid otherwise. It does check every 30 minutes though, so it is one of the only things still running when I'm away. I could reduce the number of devices it monitors. Currently I only really need it for my Hubduino and OmniThing devices, but I have it set to monitor all of them. I could set it to 1 per board.

Oh! Also, during the lockup 2 days ago... my OmniThing device also locked up!!! I'm running it with 1 contact sensor and 2 switches on an RPi. It would not update until I rebooted the RPi. So, Daniel's solution is subject to the same "vulnerability" as Hubduino is. I was a little disappointed to see that happen. But I guess I can save the time of moving everything from Hubduino to OmniThing now. lol But that's a separate topic. If I could solve the lockups, it wouldn't be a problem anymore. Grrrr. :rage:

Who was the original author of the version of Device Monitor that you're running from ST? The version I linked above was originally a ST SmartApp as well. I just want to double check with you as I would love to find a common denominator aside from HubDuino. :wink:

You're right! It is the same one! I was thrown off by the title of the thread. I wonder if that's it? It seems odd that it's the only one we have in common and we're both getting lockups. Okay, I'm going to delete it now. If that's it, I'm gonna really laugh, because the whole reason I added it was to try and deal with lockups in the first place! Sometimes, we are the cause of our own destruction. Excuse me while I go into a Hulk rage for a few minutes. HULK SMASH

I cannot add anything of real value except that I used the Device Monitor app many months ago and had to stop using it because it "seemed" to cause my hub to crawl at times. It was one of many apps that I removed at the time, so I can't say if it was the cause (my gut tells me yes). It's also why I am hesitant to use any apps that subscribe to a lot of devices. I already use SharpTools, Alexa, GH and InfluxDB.

On a side note, I too have been having occasional random lockups, about once every week or two. I just updated my InfluxDB to @ogiewon's latest version with async calls, and I removed Alexa TTS (using Chromecast in the meantime). Hopefully one or both of those help.

Home energy monitor, the Aeotec Gen5.

Oh...duh! Total face-palm moment. Sorry. (I work in the medical device field and we have a field called HEM that deals with the level of hemolysis in a sample...and I was stuck on that.)

Yeah, I've heard of Z-wave secure causing other issues. It seems that if you've removed it from secure, that should solve that problem (or at least stop it from hiding the real problem). So many red herrings you have to chase down before you find the real culprit.

That was one of the reasons I didn't implement InfluxDB. (It's also REALLY f*ing confusing, so that was a big part too. :stuck_out_tongue_winking_eye: ) This is also why I asked the question the other day about having logging in place for so many rules and whatnot. I know the act of logging a thing can sometimes be more labor intensive than the thing itself. I've seen that in some of the systems I work on for my day job. Sometimes the tools you use to track down a small problem end up causing larger problems rather than helping you fix the small one.

I've removed Device Monitor and we'll see if that has any impact. Only problem, I've got to wait a good 3 weeks before I'll know for sure.

Just looking at the code for that, it subscribes to all the events of all your selected devices and calls eventCheck every time one of them is triggered (look at the subscribeDevices method). That could potentially add up to a lot of processing. For example, if you had a group of 5 devices that turn on at once, it's going to process each "on" command while your hub is also trying to do other things. In my "good night routine" I could see this being a problem, as it goes through and turns off a lot of stuff.

Just throwing that out there.
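To make the fan-out concrete, here is a toy event bus that counts handler calls. It is only a sketch under my own assumptions: `TinyHub` is invented for illustration and is not the Hubitat API, and `event_check` merely stands in for the app's eventCheck handler.

```python
from collections import Counter


class TinyHub:
    """Toy event bus for illustration only; not how the hub is implemented."""

    def __init__(self):
        self.subscriptions = []          # list of (device, handler) pairs
        self.handler_calls = Counter()   # handler name -> number of invocations

    def subscribe(self, device, handler):
        self.subscriptions.append((device, handler))

    def send_event(self, device, value):
        # Every matching subscription runs its handler once per event.
        for sub_device, handler in self.subscriptions:
            if sub_device == device:
                self.handler_calls[handler.__name__] += 1
                handler(device, value)


def event_check(device, value):
    """Stand-in for the monitoring app's per-event work."""


hub = TinyHub()
devices = [f"light-{n}" for n in range(5)]
for device in devices:
    hub.subscribe(device, event_check)

# A "group on" command: five devices report "on" back to back.
for device in devices:
    hub.send_event(device, "on")

calls = hub.handler_calls["event_check"]
print(calls)
```

One group command turns into one handler run per device, and that is on top of whatever other apps are subscribed to the same devices.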

I had that function (logging every time a device triggers) turned off. That would be WAAAYY too much going on. It was only logging based on time and I had it set for 30 minutes. I wasn't even logging battery level.

I just removed "Device Monitor" from my hub, in the hope that it might be causing issues. My Prod hub does not have "Alexa TTS" installed on it, as I thought it was an obvious suspect weeks ago. I too have been using chromecast for now.

My custom Apps are:

ABC-Advanced Button Controller
InfluxDB Logger
SmartTiles

My Hubitat Apps are:

Amazon Echo Skill
Button Controllers
Chromecast Integration Beta
Google Home
Groups and Scenes
Hubitat Dashboard
Hubitat Safety Monitor (water leak detection only)
Hubitat Simple Lighting
IFTTT Integration
Life360 Connector
Lutron Integrator
Mode Manager
Motion Lighting Apps
Nest Integration
Rule Machine

If we can find the overlapping apps, we might be able to pinpoint the problem one(s).

I overlap on almost all of your Hubitat Apps. Mine are:
Amazon Echo Skill
Button Controllers
Chromecast Beta
Google Home
Groups and Scenes
Dashboard
HSM - Security + Battery + Smoke
Hue
IFTTT
LCM
Lutron Integrator
Maker API
Rule Machine (147 rules)
SharpTools
Z wave Poller (for 1 z-wave device)

See, my first thought was Ecobee. But I see you use Nest. Plus, I have logs for all the issues with the ecobee servers being kaput all week. And when the hub locks up, there's no logging going on.
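For anyone who wants to compare lists mechanically, a set intersection does the job. This is a rough sketch: the app names are copied from the two built-in lists above (with the notes in parentheses dropped), and the `aliases` mapping is my own guess at which differently spelled entries refer to the same app.

```python
first_hub = {
    "Amazon Echo Skill", "Button Controllers", "Chromecast Integration Beta",
    "Google Home", "Groups and Scenes", "Hubitat Dashboard",
    "Hubitat Safety Monitor", "Hubitat Simple Lighting", "IFTTT Integration",
    "Life360 Connector", "Lutron Integrator", "Mode Manager",
    "Motion Lighting Apps", "Nest Integration", "Rule Machine",
}

second_hub = {
    "Amazon Echo Skill", "Button Controllers", "Chromecast Beta",
    "Google Home", "Groups and Scenes", "Dashboard", "HSM", "Hue", "IFTTT",
    "LCM", "Lutron Integrator", "Maker API", "Rule Machine", "SharpTools",
    "Z wave Poller",
}

# Assumption: these short names mean the same apps as the longer names.
aliases = {
    "Chromecast Beta": "Chromecast Integration Beta",
    "Dashboard": "Hubitat Dashboard",
    "HSM": "Hubitat Safety Monitor",
    "IFTTT": "IFTTT Integration",
}

# Normalize the second list, then keep only apps present on both hubs.
normalized_second = {aliases.get(name, name) for name in second_hub}
common = sorted(first_hub & normalized_second)

for name in common:
    print(name)
```

Under those assumptions the two hubs share ten built-in apps, so the built-in list alone does not narrow things down much; the custom apps are the shorter list to compare.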

This may be of help to the logging question farther up in the thread. Obviously if the hub locks up, the logging will stop, but it might give a place to start. You'll need a RPI or another machine to run the nodejs on.

SmartTiles was one of my suspected apps a while back, again because of the excess logging and event subs. The only reason I haven't removed SharpTools yet is because it's such a PITA to set up.