So, as time goes on, I am beginning to rely on Hubitat more and more for my home automation. I am beginning to add some (somewhat) critical routines and devices, and it occurs to me that the hub itself is a single point of failure in the system.
Is there a way I can have multiple Hubitat hubs such that if one fails I can switch to another very quickly? For example, automatic backups of devices and apps on one hub and automatic restore on another.
Ideally, it'd be nice to have both hubs online always, and if the one I am using fails, the other steps up automatically. Kinda like the concept of a RAID but with hubs. (Redundant Array of Inexpensive Hubs, RAIH?)
How do the Zigbee / Z-wave devices pair to the hubs?
MAC address, other?
Could a physically different hub be configured in such a way that the devices don't know they are talking to a different hub? Having to pair all my devices to a different hub sounds like a big task.
Devices can only be included on one hub at a time. You can only back up and restore the radios if you subscribe to the backup service. You basically do a migration from one hub to the other. It is not bulletproof. Many times one or more devices need to be re-included (Zigbee) or deleted and re-added (Z-Wave).
It’s possible to restore a cloud backup to a new hub without having to re-pair all the devices connected to the old hub (although sometimes issues can occur, as @Slate mentioned).
But you’d have to do that manually after your first hub fails, with a spare hub you keep for that purpose.
Cloud backups require a subscription to the Hub Protect service. A local backup can be restored to a new hub but will require more work to re-pair the connected devices.
The other benefit of Hub Protect is that if your hub dies, you get a replacement hub at no additional cost (like an extended warranty). So that can become your next spare hub in the event of another hardware failure.
I keep a spare hub, with Hub Protect on both. Before going out of town for months, I restore the latest cloud backup from the main hub (C8 Pro) to the backup hub (C8), including radios, then keep it in the box in case a neighbor needs to swap it in. I do the restore with the main hub off and the backup hub inside a Faraday box with the antennas removed. Once in a while I still have an issue with a Zigbee device or two that has jumped to the backup hub when I put the main hub back online. They just need to be re-paired in place. Z-Wave has been fine.
@thebearmay has developed a way to have a backup hub. Check out this discussion.
The only extra thing the active hub does is respond to an HTTP request from the backup hub.
The radios for the backup hub are turned off. The backup hub becomes active if the active hub is not active for a set amount of time.
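To make the timeout idea concrete, here is a rough Python sketch of that kind of watchdog logic. It is not @thebearmay's code and not a Hubitat API. The names `hub_reachable` and `FailoverMonitor`, the URL you would poll, and the choice to stay active once failover has happened are all assumptions for illustration.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


def hub_reachable(url: str, timeout_s: float = 5.0) -> bool:
    """Return True if an HTTP GET to the active hub succeeds (URL is hypothetical)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        return False


@dataclass
class FailoverMonitor:
    """Decides when the backup should take over, based on heartbeat results."""
    timeout_s: float                 # how long the active hub may stay silent
    last_ok: Optional[float] = None  # time of the last successful heartbeat
    active: bool = False             # True once the backup has taken over

    def record(self, now: float, reachable: bool) -> bool:
        if self.last_ok is None:
            self.last_ok = now       # start the clock at the first check
        if reachable:
            self.last_ok = now
        elif not self.active and now - self.last_ok >= self.timeout_s:
            self.active = True       # assumption: stays active until reset by hand
        return self.active
```

In use, something would call `monitor.record(time.monotonic(), hub_reachable(url))` on a schedule, and turn the backup hub's radios on the first time it returns `True`. Keeping the backup active after the main hub reappears is a deliberate choice in this sketch, so the two hubs never run their radios at the same time without a person deciding which one wins.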
There is sort of one "work-around" to achieve failover/high availability, and that's the "loophole" of using Matter's multi-master approach. That doesn't really help with Z-Wave or Zigbee.
But you can set up Matter devices to be controlled by two independent hubs (up to 5 independent controllers, I believe), and use Hub Mesh as an HE-specific form of "cluster communications" to maintain a heartbeat and primary/backup status between the two hubs.
You would need to set up a required expression, conditional, or mode to indicate which hub is the primary, and then flip the status if one goes down. There are well-known algorithms for avoiding "split-brain" conditions in clusters using heartbeats; see "What is a split-brain scenario in a distributed cluster and how can systems prevent or resolve it?" And given Hub Mesh, there is a reliable, fast communication channel that can be shared between the hubs.
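As a rough illustration of the primary/backup flip, here is a small Python sketch of one common pattern: a heartbeat between the two hubs, a check against a third "witness" device, and a fixed tie-break. None of this is a Hubitat or Hub Mesh API. `Role`, `next_role`, the witness (say, your router), and the `i_win_ties` flag are all made-up names and assumptions; on the hubs themselves the role would live in a mode or hub variable.

```python
from enum import Enum
from typing import Optional


class Role(Enum):
    PRIMARY = "primary"
    BACKUP = "backup"


def next_role(
    current: Role,
    peer_seen: bool,            # did the other hub's heartbeat arrive recently?
    peer_role: Optional[Role],  # role the peer last reported (None if unseen)
    witness_seen: bool,         # can this hub reach the third device?
    i_win_ties: bool,           # fixed tie-break, True on exactly one hub
) -> Role:
    """Pick this hub's role for the next heartbeat interval."""
    if peer_seen:
        if peer_role is current:
            # Both hubs claim the same role: settle it with the tie-break.
            return Role.PRIMARY if i_win_ties else Role.BACKUP
        return current
    # Peer is silent. Take over only if the witness is still visible,
    # otherwise this hub is probably the isolated one and should stand down.
    return Role.PRIMARY if witness_seen else Role.BACKUP
```

For example, `next_role(Role.BACKUP, False, None, True, False)` promotes the backup, while the same call with `witness_seen=False` leaves it as backup. This only reduces the split-brain risk: if the hubs lose sight of each other but both still see the witness, both would claim primary, and closing that gap needs the witness to actively grant a lock to one side.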
Again, the trick is the endpoint devices, as neither Z-Wave nor Zigbee is designed for multi-master controllers. Matter does include that support out of the box, and most LAN/Wi-Fi/cloud approaches will also work.
So if you can remove all Z-Wave and Zigbee devices (a big ask, I know), then yeah, some failover capabilities can be set up with the tools at hand.
So I have a backup C8 Pro. Routinely, I will restore a cloud backup from my main C8 Pro to this backup hub and then power down and put it away until the next restore. Is this process functional?