Has the C8 made a hot spare possible?

Two years ago, the hot spare thread was closed with no practical solution.

I think the C8 may have changed things.

Our custom smart home in rural western Montana is designed to have zero single points of failure.

The lighting and controls for the entire home are Zigbee and Z-Wave based, with zero physical controls.

From dual wells to HVAC, everything has a hot spare. Except the Hubitat.

  1. Does the new radio migration capability allow C8-to-C8 migration?

  2. I suspect yes. Is a C8 hot spare configuration now possible with two hubs using the same backup?

  3. Can one hub be configured to watch the other and take over on failure?

Here is the business opportunity for HE:

Sell pairs of C8s with Hub Protect as a package.

More than just saving time, this system makes it possible to construct homes like ours (1/3rd less wire) that do not risk going dark, even for a few days, due to a single point controller failure.

I would appreciate an answer from HE soon enough to take advantage of introductory pricing on the C8. I have one on the way, and if the answer to hot sparing is yes, I will be buying another.

Yes, and I was able to test it.

Cold spare, yes. Hot spare... not out of the box. But I could see a scenario where you did a restore to an offline C8 regularly and had a "near-line spare" so to speak.

Hmmm. I wonder if you could have the hot standby C8 sitting there with radios off and pinging the online C8. If the heartbeat failed, it would shut down the online C8 and turn its own radios on. Possibly! But then there would be cloud integrations that need to be reset, and the standby C8 would have to assume the IP of the dead C8... could be a bit tricky. Maybe you could put them both behind a load balancer and use a VIP? But now you're starting to get into some serious complexity.
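
The heartbeat idea above can be sketched as follows. This is only an illustration of the decision logic, not anything Hubitat provides: `StandbyMonitor`, `probe`, and `max_missed` are hypothetical names, and the probe is a plain callable standing in for whatever check (ping, HTTP request) would actually be used. Requiring several consecutive misses is an assumption added here so that one dropped packet does not trigger a takeover.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StandbyMonitor:
    """Decides when a standby hub should take over, based on missed heartbeats."""
    probe: Callable[[], bool]   # hypothetical check: True if the primary answered
    max_missed: int = 3         # consecutive misses tolerated before failover
    missed: int = 0             # current run of consecutive misses
    active: bool = False        # True once the standby has taken over

    def tick(self) -> bool:
        """Run one heartbeat check; return True if this tick triggered failover."""
        if self.active:
            return False
        if self.probe():
            self.missed = 0     # any successful reply resets the count
            return False
        self.missed += 1
        if self.missed >= self.max_missed:
            self.active = True  # here the standby would enable its radios
            return True
        return False


# Simulated replies from the primary: one blip, then a real outage.
responses = iter([True, False, True, False, False, False])
monitor = StandbyMonitor(probe=lambda: next(responses))
events = [monitor.tick() for _ in range(6)]
print(events)  # prints [False, False, False, False, False, True]
```

The sketch deliberately leaves out the hard parts mentioned above: shutting down the primary, taking over its IP address, and resetting cloud integrations.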

@thebearmay was working on something similar... I bet this has already occurred to him and he's working on it in the background :). Or at least thinking about it!

Before the C-8 came out I wrote a small app to allow just that scenario: [BETA] Hub Failover - AKA an opportunity to protect or destroy your environment

With the C-8 I need to revisit it and make a few tweaks, but what you’re asking is now more possible than ever.

No, this won't work as you want. The issue is with Zigbee actuator devices (switches, bulbs, etc.). These devices keep an internal counter, and the controller (Hubitat hub Zigbee radio) does as well. If the two don't match, the device will not respond to commands from the controller.

So, if you have a C-8 backup made some time ago and restore it, migrating the radios, your real devices will be out of sync with the restored controller table values. To get those devices working again, they would have to be reset and rejoined (not a difficult process, similar to pre-C-8 Zigbee migration).
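
To illustrate why a stale backup breaks Zigbee actuators, here is a toy model of the counter mismatch described above. It is a deliberately simplified sketch, not the actual Zigbee stack or Hubitat's implementation: the `Device` and `Controller` classes are hypothetical, and the rule "accept only a counter higher than the last one seen" is an assumed simplification of the real frame counter check.

```python
class Device:
    """Toy Zigbee actuator: remembers the highest counter it has accepted."""

    def __init__(self) -> None:
        self.last_seen = -1

    def receive(self, counter: int) -> bool:
        # Assumed rule: ignore any command whose counter is not newer.
        if counter <= self.last_seen:
            return False
        self.last_seen = counter
        return True


class Controller:
    """Toy hub radio: stamps each outgoing command with an increasing counter."""

    def __init__(self, next_counter: int = 0) -> None:
        self.next_counter = next_counter

    def send(self, device: Device) -> bool:
        counter = self.next_counter
        self.next_counter += 1
        return device.receive(counter)

    def backup(self) -> int:
        # Stand-in for the radio data captured in a backup.
        return self.next_counter


light = Device()
primary = Controller()
primary.send(light)                   # normal operation
snapshot = primary.backup()           # backup taken here
for _ in range(3):
    primary.send(light)               # primary keeps running after the backup

spare = Controller(snapshot)          # restore the stale backup onto a spare
stale_result = spare.send(light)      # counter is behind what the light has seen
print(stale_result)                   # prints False

light2, primary2 = Device(), Controller()
primary2.send(light2)
spare2 = Controller(primary2.backup())  # primary stopped right after the backup
fresh_result = spare2.send(light2)
print(fresh_result)                   # prints True
```

In this model the restored spare only works when the primary sent nothing after the backup was taken.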

Thanks, sounds like I should order another C8.

I think creating a resilient control system might be handled in phases.

All local first followed by online integrations later.

The most critical functionality is to keep lights, pumps, sensors and switches working without intervention.

In our design, critical functions like light, heat and water do not depend on cloud integrations. It's one of the attractions of the HE philosophy.

A watchdog (ping) function is one way to detect a failure. If this is simple to do, it's a start.

However, subtle issues may not get caught this way.

On our idealPV module design, we use a redundant processor to watch what the active processor is doing by looking at its inputs and watching what happens. This was inspired by the space shuttle flight control computer array (5). I'm right in the middle of ETL reviewing our FMEA, so it is top of mind.

In the HE case, a possible approach is one C8 as primary and another C8 as backup, both looking at the same input devices and running the same code, with the backup acting virtually without transmitting commands. If the backup predicts an action that the primary did not perform, the backup performs the action, the primary is disabled, the backup becomes the primary, and an alert is sent. HE or the local sysop can then deal with the issue without the resident being disrupted.
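
A minimal sketch of that shadow-hub comparison, assuming both hubs see the same events and share the same rules. Everything here is hypothetical (`ShadowHub`, `rules`, the event and command strings), and it ignores timing: a real system would need to allow for the delay between an event and the primary's response before declaring a miss.

```python
from typing import Callable, Iterable, List, Set


def rules(event: str) -> Set[str]:
    """Hypothetical automation rules shared by both hubs."""
    table = {"motion:hall": {"light:hall:on"}, "temp:low": {"heat:on"}}
    return table.get(event, set())


class ShadowHub:
    """Backup hub that predicts actions and steps in when the primary misses one."""

    def __init__(self, rule: Callable[[str], Set[str]]) -> None:
        self.rule = rule
        self.is_primary = False
        self.sent: List[str] = []     # commands this hub actually transmitted
        self.alerts: List[str] = []

    def observe(self, event: str, primary_actions: Iterable[str]) -> None:
        expected = self.rule(event)
        if self.is_primary:
            # Already promoted: act on every event directly.
            self.sent.extend(sorted(expected))
            return
        missing = expected - set(primary_actions)
        if missing:
            self.sent.extend(sorted(missing))  # perform what the primary skipped
            self.is_primary = True             # promote; primary would be disabled
            self.alerts.append(f"primary missed {sorted(missing)} for {event}")


shadow = ShadowHub(rules)
shadow.observe("motion:hall", ["light:hall:on"])  # primary behaved, stay silent
shadow.observe("temp:low", [])                    # primary missed the heat command
shadow.observe("motion:hall", [])                 # shadow now acts as primary
print(shadow.sent)  # prints ['heat:on', 'light:hall:on']
```

Note that this only models the decision logic. It does not address the Zigbee counter problem raised earlier in the thread, which would still apply when the backup starts transmitting.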

If you want this to work, just don't depend on any Zigbee lights, switches, etc. See above.

I wanted to follow up on @kent1's idea. My thought for a resilient system is to:

  1. Have a 2nd C-8
  2. Install the 2nd C-8 and take it through the basic setup and registration process.
  3. Shut the 2nd C-8 down and disconnect it from power and the network so it won't be impacted by lightning strikes, power surges, etc.
  4. Periodically do local backups of the primary C-8. I'm choosing local backups because of the possibility that the primary C-8 is fried during, for example, a thunderstorm that also takes down Internet access. In this case I could still access a local backup while a cloud backup wouldn't be accessible.
  5. Periodically plug the 2nd C-8 in and perform software updates.

Now, in the event that the primary C-8 goes down, the 2nd C-8 can be plugged in and a local backup restored to the hub. This would transfer all settings, code, Zigbee devices, Z-Wave devices, etc. from the original C-8 to the 2nd C-8. The system should now be up and running again.

Does this approach make sense or is there something I don't understand?

Thanks in advance

So as long as you don't have any zigbee devices, or if they are low criticality....

I should probably just let someone else comment but here I go.

I didn't think local backups included the Z-Wave and zigbee radio backups. I thought only the cloud backups did?

But I admit I haven't put any thought into this in a long time, so could be wrong.

That is my understanding as well.

Really only workable using the cloud backup with the radio data. I had this set up for a bit in a test mode ([BETA] Hub Failover - AKA an opportunity to protect or destroy your environment) and it did fail over reasonably well, but like Bruce pointed out, the Zigbee devices are the kicker. LAN integrations can also be problematic depending on how they're implemented, but if you're primarily a Z-Wave installation this might get you there.

It also takes some regular care and feeding, so not completely set and forget.

Thanks @thebearmay. Just to confirm, the radio data is only saved in a cloud backup and not in a local backup?

Thanks again

Sorry, I was looking at the last email notification of an update to this thread and missed the other replies. I now see that it needs to be a cloud backup for this to work. For resiliency, it would be really nice if local backups were equivalent to cloud backups.

Thanks

Correct. IIRC the cloud backup now also includes the first 100MB of the File Manager files.

If it's not proprietary, how does the migration process deal with the Zigbee-to-controller counter sync issue for a C8-to-C8 migration?

After a cloud backup is made, how long can a network run before the radio databases are so out of date that a migration is not possible?

For Zigbee actuator devices, the very next time they are activated. Lights, switches, dimmers, etc.

For this reason, we recommend making the cloud backup, and then immediately shutting down the source hub (or at least disabling its Zigbee radio). This is a Zigbee issue. Part of Zigbee security is a counter kept both in the device and the controller. If out of sync, the device will not respond to commands.

Ok, I may be overstepping here.

Mod, before you reply, consider looking me up at uspto.gov

I've got a day job that involves a resilient system, but for solar, so I am openly trying to spur HE or a user to solve the resiliency issue.

In my home, Zigbee and Z-Wave replaced about a third of the copper at the design stage. That means my goal is for HE to be as reliable as wire.

Many of our Zigbee devices are 30 ft high in the blustery Montana weather. Over the past 4 years we have had only one failure, so the hw is plenty reliable, even down to -22.

There are alternative controllers, but I like the philosophy and culture of HE.

It seems what is needed is something like continuous cloud-based migration. Clearly impractical.

What is probably possible is continuous H/W-based migration.

Use a redundant C8 as a h/w backup device: a host for a real-time, encrypted, command-by-command backup. Essentially a cloud migration image, incrementally updated, where the radio databases track the primary C8 command by command. Note that the encrypted nature of the backup means any 3rd party IP (Silicon Labs) contained in the image is not exposed to the user.

I understand that local backups that include radio databases are a live-wire issue here. Please forgive me, as a recent poster, for touching the wire.

Here is the business opportunity:

I think the firmware that enables a C8-to-C8 backup is worth an upcharge of $100. That is above and beyond selling at least two C8s to the market that cares about resiliency.

In my case, I would continue with Hub Protect because I care about layers: a failover backup C8 locally, a remote backup for larger-scale events, and a warranty for failures.

Community, please chime in.

The reasons for hope that this is possible are that cloud migration works, with its multi-step, minutes-long process, and that Hub Mesh demonstrates real-time hub-to-hub synchronization.

Respectfully submitted.

Not happening. We have many higher priorities. Believe it or not, this is a sliver market opportunity. Most smart devices are cheap consumer devices, that have multiple failure modes. Wanting to create a hot spare capability is a bridge too far.

The nature of encryption (if it is designed to prevent replay attacks to circumvent it) means that you can't just duplicate an encrypted transaction, repeat it, and have it accepted by the receiver. The frame counters Bruce referred to previously will flag copied/replayed messages and they will be discarded at the destination.

The encryption keys themselves periodically get changed in secure implementations as well (I think Zigbee 3.0 requires that network keys periodically change; the legacy HA 1.2 profile doesn't enforce this requirement).