Trying to Debug Hub Lockups

BorrisTheCat · February 2, 2019, 6:19pm

Me too. I have the exact same issue, but I don't have WC installed anymore (it happens with or without it).

stephack · February 2, 2019, 6:26pm

Custom:
ABC
Boot Me Up Scottie
EspIR Manager
InfluxDb
Konnected
My Custom Occupancy Lighting
My Custom Sonos Preset Control

BuiltIn:
Amazon Echo Skill
Chromecast Integration
Google Home
Groups and Scenes
HSM - battery check, water leak and intrusion
Hubitat Simple Lighting
Hue Bridge
Lock Code Manager
Lutron Integrator
Maker Api
Mode Manager
Motion Lighting
Rachio Integration
RM
SharpTools
Sonos Integration
Zone Motion Controllers

BorrisTheCat · February 2, 2019, 6:29pm

This is what I believe too, as it seems to coincide with my issues.

Ryan780 · February 2, 2019, 6:41pm

Mine did happen in the morning but I wasn't even home at the time of the last 2 so no "good morning" routine was running or anything.

BorrisTheCat · February 2, 2019, 7:03pm

Yeah, I had the same. It happened on Christmas Day, when I hadn't been there for a few days and was 2 hours away.

system apps:

Button controllers
chromecast integration
Google home
Groups and scenes
Hubitat Dashboard
HSM
IFTTT
Life360
Mode manager
Motion lighting apps
Nest integration
Rule Machine (41)
SONOS Integration

User apps

Cobra's App (check open contacts, message central, modes plus and presence central)
Welcome Home

Custom Drivers (currently disabled for checking)

Fibaro UBS (no other driver, but I have a lot)
Aeotec water sensor 6 (currently using built-in driver)
Dual Relay Driver (nothing else about)
Virtual Container (can't see this being an issue, so not disabled)
Fibaro dimmer 2 (currently using the built-in one, but it doesn't do scenes, so my lights don't work)

srwhite · February 2, 2019, 8:10pm

When I was having this issue I had both my Aeon HEM paired securely and AlexaTTS installed. I reset the hub, the HEM is now on a second hub, paired insecurely, and I ordered a dedicated hub for AlexaTTS and some other cloud integrations.

I’ve not had a return of the crashes. I believe it was due to having the HEM paired securely. When the hub arrives next week I’ll give AlexaTTS a shot.

There is one huge difference to consider between SmartThings and Hubitat when it comes to apps. I do not believe that Hubitat puts any kind of restrictions on app execution time or resources. SmartThings has several throttles, namely the 20 second rule that kills app and driver execution. It is absolutely possible for CoRE (or any ported app) to have a bug that would never occur because of these “guard rails”.

I’m not pointing any blame to those apps, just throwing out something to consider.
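To make the "guard rail" idea concrete, here is a minimal Python sketch of an execution-time limit. It is only an illustration: `run_with_time_limit` is a hypothetical helper, not anything from SmartThings or Hubitat, and a worker thread cannot actually be killed from the outside, so this version only detects an overrun. A real platform would enforce the limit inside its own runtime.

```python
import threading
import time


def run_with_time_limit(handler, limit_seconds):
    """Run handler in a worker thread and report whether it beat the limit.

    Hypothetical helper for illustration. An overrunning handler is only
    flagged here; it keeps running in its daemon thread until it returns.
    """
    result = {}

    def target():
        result["value"] = handler()

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(limit_seconds)  # wait at most limit_seconds

    if worker.is_alive():
        return {"timed_out": True, "value": None}
    # If the handler raised, "value" was never set and None is returned.
    return {"timed_out": False, "value": result.get("value")}


if __name__ == "__main__":
    # A quick handler finishes well inside the limit.
    print(run_with_time_limit(lambda: "done", 1.0))
    # A slow handler (a stand-in for a buggy ported app) is flagged.
    print(run_with_time_limit(lambda: time.sleep(0.5), 0.05))
```

Without a limit like this, a handler that loops or blocks just keeps holding resources, which is the kind of bug that could stay hidden on a platform that cuts execution off.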

kamransiddiqi1998 · February 3, 2019, 9:10am

The only user code I have is Homebridge, and I have experienced two lockups requiring a physical reboot.

JulesT · February 3, 2019, 12:06pm

OK. And I've just had another hub lockup as well.

Looking across the logs - last entry was at 10:54, and everything past that was post reboot.

I'm not running WebCore, or similar, but AM running a bunch of other stuff. I suspect what we REALLY need is to be able to export the system logs via SNMP or similar - given that we fail HARD... my guess is a memory leak or similar is causing the oom-killer to randomly start eating processes, but we'll never get there from in-application logging.

HE guys - I'm happy to run SNMP receivers (already AM for other stuff, to be honest)... but I can't help but echo the points above - it's entirely possible that I'm the architect of my own downfall on some of the drivers that I've written... but there's no instrumentation that would help me deduce if that's so.

-- Jules

PS. Logs from the system below:

dev:482 2019-02-03 12:01:34.950 pm debug initialize...

dev:283 2019-02-03 12:01:34.680 pm debug Device Initialized: (Nest Eventstream)...

app:86 2019-02-03 10:54:45.171 am warn poll- force:false, type:null, isPollAllowed:true

app:162 2019-02-03 10:54:05.027 am debug Sending DEVICE Event (Living Room Speaker | UDN: uuid:856f546c-795b-1848-0080-0005cd5113a0) to Homebridge at (192.168.2.103:8005)

sys:1 2019-02-03 10:53:05.540 am warn Received data from 192.168.2.146, no matching device found for 192.168.2.146, C0A80292:E4A1, 0005CD3D02BC or C0A80292.

sys:1 2019-02-03 10:53:05.354 am warn Received data from 192.168.2.146, no matching device found for 192.168.2.146, C0A80292:E4A0, 0005CD3D02BC or C0A80292.

app:162 2019-02-03 10:52:00.575 am debug Sending DEVICE Event (Hallway Speaker | UDN: uuid:52b2dfaa-91b3-1f90-0080-0005cd71ca34) to Homebridge at (192.168.2.103:8005)

app:391 2019-02-03 10:51:56.829 am info postToInfluxDB(): Posting data to InfluxDB: Host: 192.168.2.247, Port: 8086, Database: Hubitat, Data: [water,deviceId=386,deviceName=Boiler\ Leak\ Sensor,groupId=null,groupName=Home,hubId=1,hubName=C4:4E:AC:1D:92:E4,locationId=1,locationName=Office\ Hubitat,unit=water value="dry",valueBinary=0i]

Interestingly... the last log entry was from Nest Integration before it fell over.
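For anyone who wants to find the silent window in a pasted log like the one above without reading it by eye, here is a rough Python sketch. The line layout (source and ID, timestamp, level, message) is inferred only from the lines in this post, not from any documented Hubitat log format, and `parse_line` and `largest_gap` are hypothetical helper names.

```python
import re
from datetime import datetime

# Inferred layout: "<dev|app|sys>:<id><timestamp> <am|pm><level><message>".
# Fields may or may not be separated by spaces, so whitespace is optional.
LOG_LINE = re.compile(
    r"^(?P<source>dev|app|sys):(?P<id>\d+?)\s*"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2}\.\d{3} [ap]m)\s*"
    r"(?P<level>trace|debug|info|warn|error)\s*(?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict for one log line, or None if it does not match."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    when = datetime.strptime(m.group("ts"), "%Y-%m-%d %I:%M:%S.%f %p")
    return {"source": m.group("source"), "id": int(m.group("id")),
            "time": when, "level": m.group("level"), "msg": m.group("msg")}


def largest_gap(lines):
    """Return (entry_before, entry_after, gap) for the longest silence."""
    entries = sorted((e for e in map(parse_line, lines) if e),
                     key=lambda e: e["time"])
    if len(entries) < 2:
        return None
    before, after = max(zip(entries, entries[1:]),
                        key=lambda pair: pair[1]["time"] - pair[0]["time"])
    return before, after, after["time"] - before["time"]


# Four of the lines quoted in this post, as pasted.
SAMPLE = [
    "dev:4822019-02-03 12:01:34.950 pm debuginitialize...",
    "dev:2832019-02-03 12:01:34.680 pm debugDevice Initialized: (Nest Eventstream)...",
    "app:862019-02-03 10:54:45.171 am warnpoll- force:false, type:null, isPollAllowed:true",
    "sys:12019-02-03 10:53:05.540 am warnReceived data from 192.168.2.146, no matching device found for 192.168.2.146, C0A80292:E4A1, 0005CD3D02BC or C0A80292.",
]

if __name__ == "__main__":
    before, after, gap = largest_gap(SAMPLE)
    print(f"Silent for {gap} between {before['source']}:{before['id']} "
          f"and {after['source']}:{after['id']}")
```

This only shows which app or device logged last before the silence. It cannot say what caused the lockup, which is why system-level logging would still be needed.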

mpoole32 · February 3, 2019, 3:01pm

The last time I experienced a hub lockup was back when an update included the ability to pair non-secure Z-Wave devices. And then this morning I had this complete lockup. Not sure if these Dashboard errors are a result of the lockup or if they somehow caused the lockup. These are the only errors recorded in past logs.

Edit
Just noticed the automated backup happened around this time. Normally that backup happens + or - 3:00am.

Ryan780 · February 8, 2019, 11:40pm

So, @ogiewon, I think we might have found our culprit. I was gone for 3 days this week with no activity at home and no lockup on my hub. I think Device Monitor was the offender. How about you? Anything since you removed it?

cuboy29 · February 8, 2019, 11:42pm

I woke up this morning to an extremely sluggish hub. It was painfully slow to navigate the menus, but I was able to go in and disable InfluxDB, and then the hub became responsive again.

I was running the latest version of InfluxDB too. Would really love to get to the bottom of this.

Ryan780 · February 8, 2019, 11:45pm

One of the reasons I've been avoiding installing that one. It's gotta be a huge resource hog.

ogiewon · February 8, 2019, 11:51pm

I removed Simple Device Monitor a while back, but yesterday my hub stopped running automations. I could still log into it, though. I contacted Support and they say my hub ran out of memory. So I must have an app that is killing it. Right now, InfluxDB is still my prime suspect. I use InfluxDB on my Development Hub as well, and it has never locked up. The mystery continues.

mike · February 8, 2019, 11:54pm

I had a lockup soon after installing device watchdog. I've removed it and none in two weeks.

JasonJoelOld · February 8, 2019, 11:54pm

We need a daily scheduled reboot feature like my Caavo has. Lol.

csteele · February 9, 2019, 12:33am

Maker API will do that for ya, I believe. I wouldn't recommend it, however. Wanting a daily reboot means there's a problem that the daily reboot will hide. "Normal" is to have zero reboots, except upgrades. Hiding the problem is great for ostriches.

Ryan780 · February 9, 2019, 12:36am

You can't reboot the hub through the maker API. The only way is through a PUT request to http://hub-ip-address/hub/reboot. Maker API works off of GET requests and does not have a command for reboot.
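As a rough sketch of what that request could look like from a script, assuming the endpoint and the PUT method are as described in the post above (neither is verified here, and the hub may expect a different method, so it is left as a parameter). The IP address is a placeholder and `build_reboot_request` and `reboot_hub` are hypothetical helper names.

```python
import urllib.request


def build_reboot_request(hub_ip, method="PUT"):
    """Build, but do not send, the reboot request described in the thread.

    Assumption: the /hub/reboot path and PUT method come from the post
    above and have not been checked against any official documentation.
    """
    url = f"http://{hub_ip}/hub/reboot"
    return urllib.request.Request(url, method=method)


def reboot_hub(hub_ip, timeout=10):
    """Send the request and return the HTTP status code."""
    request = build_reboot_request(hub_ip)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # 192.168.1.10 is a placeholder address, not a real hub.
    request = build_reboot_request("192.168.1.10")
    print(request.get_method(), request.full_url)
    # reboot_hub("192.168.1.10")  # uncomment to actually send the request
```

Building the request separately from sending it makes it easy to check the URL first. As noted earlier in the thread, running this on a schedule would hide the underlying problem, not fix it.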

vjv · February 9, 2019, 12:38am

Please send me the time of the programmed reboots of your hubs....

Ryan780 · February 9, 2019, 12:56am

I don't have mine on a timed reboot. I'm just telling you how you can reboot it.

vjv · February 9, 2019, 12:57am

I'm just joking! Lol