Backup Firmware

New to HE here,

I've had my first bout with HE instability and have found Hub Rebooter. I've got Hub Rebooter set up now, but I'm still curious about the experience others have had with HE instability, especially as it pertains to random reboots. My hub has rebooted, or just become unstable, a few times over the past few days. The last reboot shows a time that's in the future, which I can only assume is because the hub hadn't updated its clock yet. What's more interesting to me is that the firmware version appeared to be downgraded upon that same reboot. After another reboot it is up and running with the latest firmware, but it did momentarily downgrade. Does HE have a backup firmware feature, or is something funky going on with my firmware?

Started with my first Hubitat in 2018 and now have 7 of them. 3 are in a production "array" and I'm very slowly sliding a 4th in to replace my oldest. 3 of them are development hubs.

In total, I've had zero bouts of instability across the entire set of 7 hubs. Ever. None of my hubs has ever rebooted on its own. I've never run a rebooter app/tool.

Nothing of what I'm saying is intended to discount your all too real experience. My point is only that it's astonishingly unusual in my eyes. :slight_smile:

I'm glad to know your experience has been so fruitful! With you being so experienced and having such tenure with HE, do you have any ideas about what could've caused this?

Resource limits... you're running out of memory in one form or another.

Open a browser tab for Logs and there at the top is:

Screen Shot 2021-04-22 at 10.08.56 PM

And you can use the stats tabs to narrow down your top 'consumers'

You can visit the Apps page and in the upper right corner there's a grey X that you can make active (red)
Screen Shot 2021-04-22 at 10.11.19 PM

and that opens a Disable column.
Screen Shot 2021-04-22 at 10.12.49 PM

Use it in conjunction with the Stats page to disable your big consumers. The point is to debug, to identify where to focus your energy. You can choose to disable a lot and see if you end up with a longer running Hub. That would tell you that you're barking up the right tree.

And so on: re-enable a few at a time until you can identify the one or more Apps that are causing the most grief. I realize this advice will make your system next to unusable, but based on your symptoms, I'm not sure it's all that usable today. :frowning:

The same X is available for disabling Devices. Your hunt for an App might identify one or two Devices that the App uses; disabling those Devices and re-enabling the App might pay off too.
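If it helps to decide which Apps to disable first, here is a minimal Python sketch of the "big consumers" idea. It assumes you have copied app names and their total busy time (in ms) off the Stats page by hand; the function name, the 80% cutoff, and the sample numbers are all made up for illustration and are not anything the hub provides.

```python
def top_consumers(stats, share=0.8):
    """Return app names, largest first, that together reach `share` of total busy time."""
    total = sum(ms for _, ms in stats)
    picked, running = [], 0
    for name, ms in sorted(stats, key=lambda s: s[1], reverse=True):
        # Stop once the apps picked so far already cover the requested share.
        if total == 0 or running / total >= share:
            break
        picked.append(name)
        running += ms
    return picked


# Hypothetical numbers typed in from the Stats page (name, total ms).
stats = [("App A", 12000), ("App B", 300), ("App C", 5200), ("App D", 900)]
print(top_consumers(stats))  # prints ['App A', 'App C']
```

Disable the apps it lists, watch whether the hub stays up longer, then re-enable them one at a time.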

Take a look at Logs... are you seeing it just scroll and scroll and scroll? You should normally be seeing events at a "human rate" meaning if there are 4 people moving around in the house, just how many gizmos do they set off per second? per minute? At my house it's 2-4 seconds per event.

Screen Shot 2021-04-22 at 10.23.05 PM

That's almost 30 seconds of logs and there are 8 events. It's a representative sample.
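To put a number on "human rate", a rough Python sketch like the one below averages the gap between log timestamps. The timestamps are invented to resemble the sample above (8 events in just under 30 seconds); the `HH:MM:SS` format is an assumption about how you copy them out of the Logs page.

```python
from datetime import datetime


def seconds_per_event(timestamps):
    """Average gap in seconds between consecutive log events (same day assumed)."""
    times = sorted(datetime.strptime(t, "%H:%M:%S") for t in timestamps)
    if len(times) < 2:
        raise ValueError("need at least two events")
    span = (times[-1] - times[0]).total_seconds()
    return span / (len(times) - 1)


# Made-up sample: 8 events spread over 28 seconds.
sample = ["22:22:37", "22:22:41", "22:22:44", "22:22:49",
          "22:22:53", "22:22:58", "22:23:01", "22:23:05"]
print(round(seconds_per_event(sample), 1))  # prints 4.0
```

A result of a few seconds per event is in the range described above; many events per second suggests something is flooding the hub.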

You can verify the Hub itself is capable of running for days by doing a Soft Reset, which almost but not quite reverts the hub to factory... and let it run for a day like that. If it doesn't run for a day... I'd guess Hubitat would send you a replacement :smiley:

Once that test is done, then just restore a backup and you'll be back to where you were before the Soft Reset.

I love Soft Reset :smiley: I have (for my development hubs) a "Library" of backups that are just various testing scenarios. I can soft reset and restore a backup and test some idea... then soft reset and go back to the moment before the soft reset. It's a wonderfully useful tool for me.

Ok, that's more than too many ideas to narrow down what is the most likely cause... but you'll get a dozen better ones in just a few minutes.

By way of experience, I have three C7 hubs and have never had a problem. Very sorry that you are having issues, but I think @csteele has some good suggestions about possible overload. If it were me, I'd try to "de-automate" (turn most things off) for a few days, then add things back one at a time. It may be time for another hub to help lighten the load.

Thanks for the tips! There are a few RM rules that I had been having issues with, so I turned on all logging for those. I did end up resolving those issues and disabled most of the logging on those yesterday. I don't think I'm overloading it yet, as I only set it up about a week and a half ago, although it's entirely possible. I am the master of disaster after all. There isn't a lot of activity showing in the stats yet as it rebooted last night; I'll continue to monitor the situation. I'm fine with having to do periodic reboots, since I'm used to doing that with tech anyway. The thing that really gave me pause was that the build appeared to temporarily downgrade upon that reboot, at least according to the system logs. Has anyone seen that before?

the only time i had any instability was when i was running echo speaks (i think v2). outside of that, i haven't had any issues with my c-5 or c-7

i would take a look at the apps/rules you're running to see what could be causing the instability

I think I may have found a clue, although it's not an official RCA. I had changed some rules that were not running the way I wanted them to. I learned that RM rules use OR as opposed to AND, so I changed those rules to IF condition rules, and they worked. Later that evening is when the hub rebooted, and I didn't notice until today that the rule changes I had made had been reverted and were no longer working as I wanted them to; they'd been changed back to Simple condition rules. I've changed them back now, and will monitor to see if the issue occurs again.

Reverting like that, automatically as a result of a reboot, is due to a corrupt DB. During boot, the DB is checked. If it's found to be corrupt, the hub rolls back one DB automatically. If that one is corrupt too, it rolls back one more, and so on until all 4 of the internal backups have been tried.
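As an illustration only, the rollback described above amounts to "take the newest database that passes the check". This Python sketch is not the hub's actual code; `pick_database`, the candidate names, and the `is_corrupt` check are all hypothetical.

```python
def pick_database(candidates, is_corrupt):
    """Return the first candidate (newest first) that passes the check, else None."""
    for db in candidates:
        if not is_corrupt(db):
            return db
    return None


# Hypothetical boot: the live DB is corrupt, so the newest internal backup is used.
candidates = ["live", "backup-1", "backup-2", "backup-3", "backup-4"]
corrupt = {"live"}
print(pick_database(candidates, lambda db: db in corrupt))  # prints backup-1
```

That is why rule changes made after the last good internal backup can vanish after a reboot.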

It's a stellar reason to download periodic backups to your local PC. You'll find lots of advice in other threads/topics on how to automate it. The process of exporting the DB into the backup file will remove most corruption.
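For the automation part, here is a minimal Python sketch you could run from cron or Task Scheduler on your PC. It assumes the hub serves a fresh backup at `/hub/backupDB?fileName=latest`, which is the URL commonly quoted in community threads; treat that path, the `.lzf` extension, and the example IP address as assumptions and check them against your own hub. The function names are made up.

```python
from datetime import datetime
from pathlib import Path
from urllib.request import urlopen


def backup_path(dest_dir, now):
    """Build a timestamped file name so older backups are never overwritten."""
    return Path(dest_dir) / f"hubitat-{now:%Y-%m-%d_%H%M}.lzf"


def download_backup(hub_ip, dest_dir, now=None, fetch=urlopen):
    """Fetch a backup from the hub and save it under dest_dir; returns the path."""
    # Assumed endpoint (from community posts); verify it on your firmware.
    url = f"http://{hub_ip}/hub/backupDB?fileName=latest"
    target = backup_path(dest_dir, now or datetime.now())
    target.parent.mkdir(parents=True, exist_ok=True)
    with fetch(url) as resp:
        target.write_bytes(resp.read())
    return target


# Example call (placeholder IP address and folder):
# download_backup("192.168.1.50", "hubitat-backups")
```

The `fetch` parameter is only there so the function can be tried without a hub on the network; in normal use leave it at the default.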

Aha, well I think that answers my question. Thanks for the info! I'll get a backup cycle in rotation and hopefully never have to use it :rofl: