Hub frozen

Hello all, I'm newer to Hubitat (about 4 months).

My hub seems frozen: no devices respond to commands, nor do they update when changed manually. I've tried googling and checking the docs for troubleshooting steps, and based on what I've found I've tried the following:

  • restarting the hub
  • restoring a database backup from 3 days ago, then restarting
  • updating the hub to v2.2.7.126
  • soft rebooting the hub, then restoring from the 3 day old backup

At some point during the troubleshooting, I noticed this SQL error from one of my dashboards (app 5). I tried removing the dashboard and then restarting again, but everything is still frozen.

app:5 2021-06-10 13:11:53.461 error java.lang.RuntimeException: java.lang.RuntimeException: java.sql.SQLException: A problem occurred while trying to acquire a cached PreparedStatement in a background thread. Query: SELECT D.ID, D.VERSION, D.DEVICE_TYPE_ID, D.DEVICE_NETWORK_ID, D.LABEL, D.NAME, D.CONTROLLER_TYPE, D.ZIGBEE_ID, D.ENDPOINT_ID, D.PARENT_INSTALLED_APP_ID AS PARENT_APP_ID, D.PARENT_DEVICE_ID, D.IS_COMPONENT, DT.NAME AS DEVICE_TYPE_NAME, DT.TYPE AS DRIVER_TYPE, D.CREATE_TIME, D.UPDATE_TIME, D.LAST_ACTIVITY_TIME, D.DISABLED, D.LAN_ID, D.DISPLAY_AS_CHILD, D.DATA AS DATA_JSON, D.MESH_ENABLED, D.MESH_FULL_SYNC, D.collection_id AS room_id, IFNULL(NULLIF(D.MAX_EVENTS, 0), 100) AS MAX_EVENTS, IFNULL(NULLIF(D.MAX_STATES, 0), 30) AS MAX_STATES FROM DEVICE D JOIN DEVICE_TYPE DT ON DT.ID = D.DEVICE_TYPE_ID WHERE D.ID = ? Parameters: [1] on line 196 (getDevices)
app:5 2021-06-10 13:09:07.989 error java.lang.RuntimeException: java.lang.RuntimeException: java.sql.SQLException: A problem occurred while trying to acquire a cached PreparedStatement in a background thread. Query: SELECT D.ID, D.VERSION, D.DEVICE_TYPE_ID, D.DEVICE_NETWORK_ID, D.LABEL, D.NAME, D.CONTROLLER_TYPE, D.ZIGBEE_ID, D.ENDPOINT_ID, D.PARENT_INSTALLED_APP_ID AS PARENT_APP_ID, D.PARENT_DEVICE_ID, D.IS_COMPONENT, DT.NAME AS DEVICE_TYPE_NAME, DT.TYPE AS DRIVER_TYPE, D.CREATE_TIME, D.UPDATE_TIME, D.LAST_ACTIVITY_TIME, D.DISABLED, D.LAN_ID, D.DISPLAY_AS_CHILD, D.DATA AS DATA_JSON, D.MESH_ENABLED, D.MESH_FULL_SYNC, D.collection_id AS room_id, IFNULL(NULLIF(D.MAX_EVENTS, 0), 100) AS MAX_EVENTS, IFNULL(NULLIF(D.MAX_STATES, 0), 30) AS MAX_STATES FROM DEVICE D JOIN DEVICE_TYPE DT ON DT.ID = D.DEVICE_TYPE_ID WHERE D.ID = ? Parameters: [1] on line 196 (getDevices)
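
For anyone who wants to compare entries like these after copying them out of the logs page, here is a rough Python sketch that pulls out the app ID, timestamp, exception chain, and the trailing "on line N (method)" part. This is not a Hubitat tool. The regex, the `dev` source prefix, the list of log levels, and the function name `parse_log_line` are all my own assumptions based on the two lines above.

```python
import re
from datetime import datetime

# Layout assumed from the pasted entries: "app:<id> <date> <time> error<message>".
# The level may or may not be separated from the message by whitespace.
LOG_RE = re.compile(
    r"^(?P<source>app|dev):(?P<id>\d+)\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>error|warn|info|debug|trace)\s*"
    r"(?P<message>.*)$",
    re.DOTALL,
)

def parse_log_line(line):
    """Split one pasted log entry into fields; return None if it does not match."""
    m = LOG_RE.match(line.strip())
    if m is None:
        return None
    message = m.group("message")
    # Fully qualified Java exception names, in the order they appear.
    exceptions = re.findall(r"\b((?:[a-z]+\.)+[A-Z]\w*Exception)\b", message)
    # Optional trailing "on line 196 (getDevices)" location.
    loc = re.search(r"on line (\d+) \((\w+)\)\s*$", message)
    return {
        "source": m.group("source"),
        "id": int(m.group("id")),
        "timestamp": datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f"),
        "level": m.group("level"),
        "exceptions": exceptions,
        "line": int(loc.group(1)) if loc else None,
        "method": loc.group(2) if loc else None,
    }
```

The last entry in `exceptions` is the innermost cause, which for both entries above is the `java.sql.SQLException`.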

Happy to provide more info if needed.
What am I missing? What other steps should I take?

Thanks!

@bobbyD or @gopher.ny might be able to provide more insight

While you're waiting for BobbyD or Gopher.ny I wonder if you have a community supported app or driver that's giving you grief...?

@brad5 Not that I've seen in the logs. Is there someplace else I should check?

I have several drivers and apps that I've written myself.

I'm sure your coding is top notch and is not the issue, but it might be good to eliminate them from the suspect list, so to speak.

@brad5 you give me too much credit.... :sweat_smile:

I agree that removing that variable could be useful. I'm not sure what the steps are to "properly" do it.

Only 9 of my 41 devices and 5 of my 19 apps use built-in drivers/apps. Depending on what the steps are to properly disable the custom ones, it's a good bit of work. I can confirm that the devices using built-in drivers are not working.

I should add, I just shut down the router, then unplugged it for about 15 minutes. After startup, I could turn on one switch, but then the hub froze again. There are no errors in the logs.

Yeah it is a good bit of work. On the other hand so is writing all that custom code, so I imagine you've got a fair amount of time already invested.

When you say the hub is frozen is the whole UI down as well? Including the diagnostics page on port 8081?

I'd suspect the best way to disable them is simply to remove them. What about making a backup, then removing all of them, validating that the hub is stable, and then adding them back one at a time until it breaks again?
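
The "add them back one at a time" idea can be written down as a small loop, shown here in Python purely to make the procedure explicit. Nothing in it talks to the hub: `install` and `is_stable` are hypothetical callbacks standing in for whatever you do by hand (re-adding an app or driver, then watching the hub for a while).

```python
def find_culprit(candidates, install, is_stable):
    """Re-add candidates one at a time and return the first one that breaks things.

    candidates: ordered list of app/driver names to re-add.
    install:    hypothetical callback that re-adds one item (manual step in practice).
    is_stable:  hypothetical callback taking the items installed so far and
                returning True if the hub still behaves.
    Returns the first item after which is_stable reports False, or None if all pass.
    """
    installed = []
    for item in candidates:
        install(item)
        installed.append(item)
        if not is_stable(installed):
            return item
    return None
```

Checking after every single addition is slow with dozens of apps and drivers, but it points at exactly one item, which is the reason for going one at a time rather than in batches.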

Since shutting down the hub and leaving it unplugged, then restarting, things seem to be working now. Apparently rebooting from the hub interface wasn't enough.
I'm currently walking around the house to see if the lights turn on and off as expected.
I'm still not sure how the hub ended up in this state if the original problem was a background query that seems to fetch a device list for a dashboard.

What you're saying about starting fresh and restoring from a backup makes sense. I'll nuke everything and restore from a backup if my first course of action doesn't pan out. The diagnostics page has always been available (as have all pages, devices, dashboards, etc.). It's just like the devices and the hub are not communicating.
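
For anyone else trying to tell "hub UI is down" apart from "UI is up but devices are not talking", a quick reachability check from another machine on the LAN can confirm the first part. This is only a sketch: port 8081 is the diagnostics port mentioned above, port 80 for the main UI is my assumption, and the IP address in the comment is a made-up placeholder.

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

def check_hub(host, ports=(80, 8081)):
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_open(host, port) for port in ports}

# Example call (hypothetical address, replace with your hub's IP):
# check_hub("192.168.1.50")
```

If both ports answer while devices still do nothing, the problem is more likely in the radio or database layer than in the network path to the hub.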

Are your devices Z-Wave or Zigbee? I've seen references suggesting that a 30-second power down of the Z-Wave radio (which is what you did) can reset a "stuck" radio and clear its cache...

What if you installed a second HE as a lower environment and used hub mesh to expose the devices on the primary to apps running on the secondary? That wouldn't address device drivers but it would isolate custom apps a bit. Once you knew they were stable you could always promote them to your prod hub!

They are indeed Z-Wave, so that makes sense.

Interesting idea about the second hub. Aside from this single issue, I haven't had the need for a staging-type environment, so I'm not keen on the additional cost of a second hub.

At any rate, thank you for helping troubleshoot this :slight_smile:

I see a number of reboots within 10 or so minutes of each other. Was that intentional, or did the hub get into some sort of reboot loop on its own?

I'm glad things got resolved, but we still need to figure out how the hub got to this state. Going through the logs now.

Thank you Sir - I initiated all the reboots between 13:32 and 14:58 CDT on 2021-06-10.

The hub got into this state sometime between 2021-06-09 23:00 CDT and 2021-06-10 07:45 CDT. Automation was working fine right up until I went to bed, but was frozen when my alarm went off in the morning.

Happy to provide more info, just let me know what helps.