Just came home to find Hubitat unresponsive. Reboot hub did soft reset

mavrrick58 · May 14, 2021, 10:41pm

Ok, so simply put, my wife and I went out for an early dinner to avoid the crowds. When we got back I was unable to disarm the alarm. When I tried to check Hubitat, the device was hung and I couldn't get into the hub at all.

I pulled the plug to reboot the hub and was able to get back into the hub but nothing was showing configured and the hub was acting as if it was new. I was able to do a restore to earlier today, but wanted to check as to why this happened.

When checking the past logs I see a bunch of these error messages from several devices:

java.lang.RuntimeException: java.sql.SQLException: A problem occurred while trying to acquire a cached PreparedStatement in a background thread. Query: UPDATE DEVICE D SET D.STATE = ? WHERE D.ID = ? Parameters: [{"Copyright":"© 2019 C Steele -- v2.0.1","Status":"Current","InternalName":"AeotecMultiSensor6_2","firmware":1.14,"UpdateInfo":"12/28/2020"}, 324] (parse)

The flood of errors appeared to start after the message below:

org.codehaus.groovy.runtime.InvokerInvocationException: org.quartz.JobPersistenceException: Couldn't obtain job names: A problem occurred while trying to acquire a cached PreparedStatement in a background thread. [See nested exception: java.sql.SQLException: A problem occurred while trying to acquire a cached PreparedStatement in a background thread.] (sendPing)

The device that triggered this error message is a virtual device using @thebearmay's driver for doing pings. It pings my Node-RED server to verify it is up, since that server helps with automations.

Fortunately the restore appears to have worked perfectly.

thebearmay · May 14, 2021, 11:01pm

I’ll see if I can duplicate it, but it looks like a database error. I’m guessing that a scheduled ping lost its IP address parameter. As to why/how, it’s a pretty standard runIn call, so???

mavrrick58 · May 14, 2021, 11:10pm

I agree it looks database related. I upgraded the hub today to the latest minor release, so not sure if that has anything to do with it.

My suspicion is that it is related to the new API they added to send pings out. I have 3 devices sending pings anywhere from once a minute to once every 5 minutes. To me it looks a lot like something may have caused the database to hang from running the ping command so frequently. Until today I have never had to reboot the hub outside of normal maintenance. Once this started, I got a ton of errors from various devices of all connectivity types.

thebearmay · May 14, 2021, 11:23pm

I’ve been considering adding the option to select the old method as an alternative, might just go ahead and do it.

Edit: Option is now available.

Equis · June 21, 2021, 3:23am

This just happened to me. Lights, rules, and dashboards stopped working so I rebooted the hub from the app. When it came back up I had the "Get Started" or restore from backup links. I just restored from the second-to-latest backup because I noticed the last backup file size doesn't look like the others...

Just a couple days ago I set up a webCoRE piston to ping a device on my network every 5 minutes, so add me to the ping/hub reset list.

Here's the error I'm seeing in the logs...

It seems like ping keeps the hub pretty busy. I reduced it to only one ping to see if that helps.

neonturbo · June 21, 2021, 4:06am

Ecobee seems to give some people issues. There were a couple recent threads (last 2-3 weeks) that explored making it less likely to crash the hub. Maybe do a search sorted by date for Ecobee?

thebearmay · June 21, 2021, 12:36pm

I'm a little confused by the error as that driver has never had a refresh command, and none of the capabilities it instantiates requires/creates a refresh command. So how did your piston attempt to call it? Only command exposed is sendPing....

Ping can bring your hub to its knees if you have too many running, and you definitely don't want several instances running concurrently. For that reason, the driver is coded not to send a second ping on its own (regardless of the delay you select) until the result of the first one has been received. You can override that safeguard by manually issuing a second ping or by creating multiple instances of the driver.
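To illustrate the safeguard described above, here is a minimal Python sketch of a "one ping in flight" guard. It is not the driver's actual code (Hubitat drivers are written in Groovy); the class name `PingGuard`, the `send` callable, and the `on_result` method are all hypothetical names chosen for the example.

```python
class PingGuard:
    """Allow at most one outstanding ping at a time (illustrative only)."""

    def __init__(self, send):
        # `send` is a hypothetical callable that starts a ping to a host.
        self._send = send
        self._in_flight = False

    def request_ping(self, host):
        """Start a ping unless one is still outstanding.

        Returns True if a ping was sent, False if it was skipped.
        """
        if self._in_flight:
            return False
        self._in_flight = True
        self._send(host)
        return True

    def on_result(self):
        """Call when the ping result (or a timeout) comes back."""
        self._in_flight = False
```

Each guard keeps its own flag, so in this sketch two separate instances could still ping at the same time, which mirrors the point about multiple instances of the driver bypassing the safeguard.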

Equis · June 21, 2021, 5:50pm

I only have one ping "device" running. I have a piston that issues sendPing once every 5 minutes. This is just to check if the device is available and if not, check another one. It's just to check if the power is on, and if not, shut down the hub using the shutdown command. I don't think it's ever gotten to the second or third IP addresses because the first one is a mesh WiFi point and always available.
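The fallback logic described above can be sketched roughly as follows. This is Python rather than a webCoRE piston, and it assumes the hub should shut down only when none of the addresses respond; `ping` and `shutdown` are hypothetical callables standing in for the sendPing result and the hub's shutdown command.

```python
def power_is_on(hosts, ping):
    """Return True as soon as any host answers.

    `ping` is a hypothetical callable: ping(host) -> bool.
    Later hosts are only tried when earlier ones fail.
    """
    for host in hosts:
        if ping(host):
            return True
    return False


def check_power(hosts, ping, shutdown):
    """Run one check cycle; call `shutdown` if no host responds."""
    if power_is_on(hosts, ping):
        return True
    shutdown()
    return False
```

Because the loop stops at the first host that answers, an always-available first address means the second and third are never pinged, which matches what was observed.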

It looks like it's generating the error every minute...

I'll pause the piston for now to see if it continues. EDIT: It appears the error is continuing every minute after pausing the piston. Nothing should be calling a sendPing or otherwise using the device.

thebearmay · June 21, 2021, 5:57pm

Open the device page, scroll down to the bottom, and make sure that the piston (and webCoRE) are the only things listed as using it. That will tell us if something else could be sending pings, but I'm still curious as to how something is requesting the execution of a command that doesn't exist anywhere within the code.

All of those errors will eventually exhaust available memory and cause the hub to crash, so it's important to figure out where the call is coming from.

Equis · June 21, 2021, 6:11pm

So strange. I removed it completely from webCoRE. Nothing is using the device, but look at the Scheduled Jobs. There's a refresh scheduled every minute.
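The check being done by eye here (comparing the Scheduled Jobs list against what the driver actually implements) can be sketched in Python. The job dictionaries and the `handler` key are a made-up representation for illustration, not a format the hub exports; the handler names `sendPing`, `logsOff`, and `refresh` are the ones mentioned in this thread.

```python
def orphaned_jobs(scheduled_jobs, known_handlers):
    """Return the jobs whose handler the current driver does not define.

    scheduled_jobs: list of dicts with a "handler" key (hypothetical format).
    known_handlers: set of method names the driver is known to implement.
    """
    return [job for job in scheduled_jobs if job["handler"] not in known_handlers]


# Example data modeled on the thread: the driver defines sendPing and
# logsOff, yet a refresh job is scheduled every minute.
jobs = [
    {"handler": "refresh", "schedule": "every 1 minute"},
    {"handler": "logsOff", "schedule": "once"},
]
stale = orphaned_jobs(jobs, {"sendPing", "logsOff"})
```

A job that survives this filter points at a method the driver no longer (or never) had, which would produce an error each time the schedule fires.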

Next step might be to remove the device and make a new one. ¯\_(ツ)_/¯

thebearmay · June 21, 2021, 6:15pm

I would. The logsOff job is one I schedule 30 minutes after debug logging is turned on, to turn it back off, but I'm not sure where the refresh came from unless you had a different driver selected at one point.

Equis · June 21, 2021, 9:00pm

So far it looks like deleting the old device and creating a new one fixed the issue. There are no scheduled jobs now, and I added the device back to my piston. I'll keep an eye on it. Thanks for your help!