2.2.9.131 Restore after soft reset seems broken

Something is horribly broken with the 2.2.9.x soft reset restore process. I have done soft resets many times before and know the process. I saw the same thing after the 2.2.9.130 update, and I’m seeing it again after the 2.2.9.131 update. After 2.2.9.130 I experimented with reverting to prior platform versions and restoring previously downloaded backups, but this time I can’t recover.

Here is what I am seeing:

Download new backup before update to 2.2.9.131.
Update from 2.2.9.130 to 2.2.9.131. Process goes well.
Use HubIP:8081 to get to Diagnostic Tools, log in.
Do a soft reset.
At Get Started green screen, choose “Restore from Backup?”
Choose backup made and downloaded just before the Soft Reset.
The backup restore goes normally, hub reboots normally, but goes to the Get Started green screen, not to the hub main menu.

I have tried rolling back to several prior versions of 2.2.9.x, same results.

I have tried restoring several of the downloaded backups that I’ve got. I have quite a few downloaded backups - I never do an update without backup download first.

Stuck in the Get Started green screen loop.

I’ve even tried clicking through the Get Started screens, then restoring one of the previously downloaded backups. Same result, goes to Get Started screen after reboot.

I’ve also tried restoring from on-hub backups, same result.

One oddity I note is that the recent on-hub backups are tiny, as if it’s a new hub. See screenshot.

The hub is a C-7.

I even tried the magic incantation:

http://myHubIP/hub/acceptTOS

Same result, still can’t get beyond Get Started green screen. Right now, I believe I have the hub running 2.2.9.131 with the database downloaded just before the soft reset, but can’t get beyond green Get Started Screen.

Suggestions?

Have you tried a power down and restart?

I was able to bring the hub back up by reverting to 2.2.8.156 and restoring the backup made just before the update to 2.2.9.128. Then I updated the platform to 2.2.9.131 and restored the backup made before the 2.2.9.130 to 2.2.9.131 update. Green screen again. So the backup I made before the 2.2.9.130 to 2.2.9.131 update must be bad.

Then, while running 2.2.9.131, I restored the backup from 2.2.8.156. That works, so I am back up.

Looks like 2.2.9.130 made bad backups. I am going to walk through the 5 backups made with 2.2.9.130 to see where the failure was.

I should be able to take it from here. I don't have any idea what went wrong, but I can go forward from the 2.2.8.156 backup. I just lost all the Rule conversions from 4.1 to 5.1, no big deal.

Thanks so much for the suggestion.

Tiny backups indicate empty or nearly empty database. I'll investigate.

For all of you with your hair on fire about Hubitat support being able to access your hub for support purposes, this is the perfect example where a staff engineer, after hours, worked with me not only to get my hub back up, but also to analyze the backups that the hub refused to load. Unknown to me, at some point during 2.2.9.129, the database must have become corrupted and the hub refused to restore any backups made after that corruption occurred. The hub was working fine, no changes had been made between the database being good and then having issues. I had no indication that anything was wrong. Victor is analyzing the last "good" backup and the first "bad" backup to see what happened and to see if he can adjust the Backup/Restore code to handle this better. Fortunately, I have a long trail of backups going back over a year.

Again, this is some of the best support I have seen from any company.

I agree 100%. You won't find better support anywhere.

The lesson here is that a backup is only a backup if it can be successfully restored. As I always do when setting up a backup strategy, I tested the full loop of Backup, Soft Reset, Restore when I started doing backups. I didn’t consider that something could silently corrupt the database, leave the hub fully functional through multiple firmware updates, and yet make backups taken from that working hub impossible to restore.

Looks like I will have to add periodic auditing of the backups to my schedule to validate that the backups can be restored.
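
A size check is one cheap first pass for that kind of audit. The sketch below is only an illustration, not anything Hubitat provides. It assumes the downloaded backups sit in one local folder and use the `.lzf` extension (both assumptions; adjust the pattern to match your files), and it flags any file much smaller than the median, which is the "tiny backup" symptom mentioned earlier in the thread. The function names and the 0.5 threshold are made up for the example.

```python
from pathlib import Path
from statistics import median


def find_suspicious_backups(sizes, ratio=0.5):
    """Return the names of backups smaller than ratio * median size.

    sizes: dict mapping backup file name -> size in bytes.
    ratio: hypothetical threshold; tune it to your own backup history.
    """
    if not sizes:
        return []
    typical = median(sizes.values())
    return sorted(name for name, size in sizes.items() if size < ratio * typical)


def scan_backup_dir(directory, pattern="*.lzf"):
    """Collect file sizes for downloaded backups in a local directory.

    The "*.lzf" pattern is an assumption about how the files are named.
    """
    return {p.name: p.stat().st_size for p in Path(directory).glob(pattern)}


if __name__ == "__main__":
    # "hub-backups" is a placeholder folder name.
    for name in find_suspicious_backups(scan_backup_dir("hub-backups")):
        print(f"Suspiciously small backup: {name}")
```

This only catches the empty or nearly empty database case. A backup of normal size can still fail to restore, so it does not replace an actual test restore.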

Perhaps Victor (@gopher.ny) will be able to figure out what went wrong and add some appropriate validation/sanitization code to the backup/restore process.

Thanks to all who assisted, especially Victor, who went above and beyond the call of duty.

Good stuff - thanks for sharing this success story.

If there's a broader issue to be addressed, I'm confident Victor will get to the bottom of it!

Yeah, figured out what happened here. There will be restore process changes (in one of the 2.2.9 hotfixes) to make it more robust.

I’ve had similar experiences with @bobbyD and @bcopeland.

I think the incident that was discussed yesterday would’ve been easily resolved with a simple request, “Next time, would you mind notifying me before accessing my hub”. Hubitat is a small company with dedicated staff, and I’ve found them to be very accommodating.

2.2.9.132 (just released) has revised restore logic that should be more robust, or at least not fail in the same manner.

Thanks. I will give it a shot with my unrestorable backup.

And I will follow your lead and do a test restore on my nonprod hub! Good advice to test the restore process periodically. I do it on other backups regularly, didn't think to do it here!

Well... doing an upgrade to 2.2.9.132 to test a soft reset and restore seems to have killed my poor hub! I mean it just doesn't get more ironic than that! I did the upgrade, it seemed to go ok, then I got to here and it just sat here...

It sat there for about 20 min, so I navigated to the tools page and did a soft reset. That's when I got a message about the database being corrupt and to do a soft reset...

So I did another soft reset and I'm back to waiting at 15%. I tried power cycling; the hub came back online but is still stuck at 15%.

Hub 1, Brad 0.

Am now rolling back to 2.2.9.130. Database corrupt again. Did another soft reset, hub came back! Doing a restore. Restore successful, now upgrading to 2.2.9.132 again. Will report back in the morning!

Well, Houston, we have a problem.

Starting point:
C-7 running 2.2.9.131, database from the “last good” backup taken Oct 9, 2021, that I restored yesterday to get me back online.

Took a backup with 2.2.9.131, downloaded locally. Looks like the correct size.
Tested that backup by immediately restoring using 2.2.9.131, successful.

Did update to 2.2.9.132. Successful.
Took a backup with 2.2.9.132, then tried to restore. Unsuccessful.

Here is the screenshot of the explosion:

Went to HubIP:8081 (Diag tools) (running 2.2.9.132), did soft reset, tried to restore the good backup that was made and tested with 2.2.9.131. Hung at Initializing Hub: 15% - updating database

Waited 15 minutes, no change. I’m now going to roll back to 2.2.9.131 and the backup I made at the beginning of this. The backup made with 2.2.9.132 just won’t restore on 2.2.9.132 after a soft reset.

Noted, looking into it.

2.2.9.131 comes back up. Using Diag Tools, I did a soft reset, and the backup that 2.2.9.131 made before the update to 2.2.9.132 restores fine.

It may be that the database has some latent corruption that allows the hub to operate, backup, and restore (with firmware prior to 2.2.9.132), but that database may not pass an aggressive validation check.

I've emailed the good and bad backups to Victor, and I will let him pore over them in his spare time. In the meantime, before making any update I will make and download a backup, verify that it will restore, make the update, do another backup and download, and then check that the updated platform can restore its own backup.
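
The before/after routine described above could be tracked with a small manifest, so it is obvious which backup was last proven restorable. This is a hypothetical sketch: the manifest file, its field names, and the helper functions are all invented for illustration. The restore test itself is still done by hand on the hub; the script only records the outcome along with a SHA-256 checksum of the downloaded file.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_backup(manifest_path, backup_path, platform_version, restore_tested):
    """Append one backup entry to a JSON manifest and return all entries.

    restore_tested is whatever you observed after a manual test restore.
    """
    manifest_path = Path(manifest_path)
    entries = json.loads(manifest_path.read_text()) if manifest_path.exists() else []
    entries.append({
        "file": Path(backup_path).name,
        "sha256": sha256_of(backup_path),
        "platform": platform_version,
        "restore_tested": bool(restore_tested),
    })
    manifest_path.write_text(json.dumps(entries, indent=2))
    return entries


def last_verified(entries):
    """Return the most recent entry whose test restore succeeded, or None."""
    for entry in reversed(entries):
        if entry["restore_tested"]:
            return entry
    return None
```

With something like this, `last_verified(...)` answers the question that mattered in this thread: which backup is the newest one known to restore.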

No, this was a mistake on my part. 2.2.9.133 is out with the fix.
This scenario will be a part of automated test coverage for the next build.

Victor, You The Man!

Running 2.2.9.131, I made a backup (simply to preserve all the states, etc., that had changed at sunset), did the update to 2.2.9.133, made another backup.

Then, for the big test, I restored the last corrupted database from yesterday, the one that had all the rule migrations and cleanup from 4.1 to 5.1 that I had lost when nothing would restore and I had to reach back to a backup from Oct 8. The corrupt database completed the restore, and all of the upgraded rules are there. Everything works. Thanks so much.

What I may do, now that everything is back, is to export all of the rules, go back to an earlier backup before things became silently corrupted, and import the rules. That would give me a high level of confidence in the running database.

Thanks so much.

Mine is all set too after downgrading and then upgrading again. Test restore worked fine as well. Dunno why I had so much trouble but fortunately it was a nonprod hub :)
