How to detect that my other hub is locked up?

BLOT: What type of rule or notification can I create on my main hub to notice if the aux hub is not running, when they are using Hub Mesh?


I have two C-7s. One is the main hub with 200+ Zigbee and Z-wave devices. The other is the aux hub, with both radios disabled and only network integrations (Ecobee, HADB, WyzeHub, etc). The aux hub uses Hub Mesh to share all its collected devices to the main hub. The aux hub has no automations on it (no active rules of any kind).

The aux hub has locked up twice, 6 months apart. By locked up, I mean the main web interface was not responsive. Going to :8081 worked, and allowed me to do a shutdown, unplug, and restart. But that didn't fix it. Both times I ended up needing to do a soft reset, followed by a database restore. This brought the hub back to life.

But the last time this happened it took me 4 days to notice that the hub was locked up. E.g., it stopped producing backups on Aug 23 and I didn't notice until Aug 27 that it was unresponsive. And I only noticed because I went there to install the beta release.

Why didn't I notice? The devices on the aux hub just feed data into rules on the main hub, but none of them are particularly important in day-to-day automations. I just didn't notice the one or two things not happening.

But the main hub should have seen the aux hub stop responding to Hub Mesh. I wish there was an alert about this. @bobbyD

(and I don't know why the aux hub has become unresponsive twice. It's working fine otherwise.)


What type of rule or notification can I create on my main hub to notice if the aux hub is not running?

My initial thought was to create a variable on each hub and share them with each other. Then have a rule (on each hub) update the local variable to the current date/hour (e.g., "MM-DD-HH"). I'd run that on the hour. I could then have another rule compare the two variables and notify me if they ever don't match. I'd have this latter rule run at 5 minutes after the hour.

Is there a simpler way to detect a hub not responding?

From the main hub you could possibly set up a ping and if the ping isn't returned then notify... (you could also set this up on the second hub to ping the 1st as well)

This. In fact, I just set up the same scenario suggested by rlithgow1 with webCoRE and this driver - [Release] Hubitat Ping. I get a notification if there's an issue.

From reading the above linked Hubitat Ping thread, the ping functionality appears to be built into Rule Machine now. You may not even need the driver in that case.

Well, the hub was responding to pings, so that wouldn't catch this.

I suppose I could do a GET request on a maker endpoint to see if that was working.
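
For anyone who would rather run that check from a separate always-on machine, here is a minimal Python sketch of the idea: issue a GET and treat any error, timeout, or non-2xx status as "hub not responding". The function name `hub_responds` and the `fetch` parameter are my own, and the URL in the usage comment is only a placeholder; substitute whatever endpoint your hub actually exposes.

```python
from urllib.request import urlopen


def hub_responds(url, timeout=10.0, fetch=urlopen):
    """Return True if an HTTP GET to `url` completes with a 2xx status.

    `fetch` defaults to urllib's urlopen; it is a parameter only so the
    check can be exercised without a real hub on the network.
    """
    try:
        with fetch(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses,
        # so refused connections, timeouts, and HTTP error codes land here.
        return False


# Hypothetical usage (placeholder address, app id, and token):
# if not hub_responds("http://192.0.2.10/apps/api/<app-id>/devices?access_token=<token>"):
#     print("Aux hub web service is not answering")
```

Unlike a ping, this only succeeds if the hub's web service answers, which is the part that was failing in my case.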

That's what I just set up. My hubs are named Home and Net.

On each hub, create a Hub Variable:

image

On each hub, share the variable with Hub Mesh:

image

After sharing the variable on both hubs, you can then link the other hub's variable with Hub Mesh:

image

At that point, the Hub Variable page on each hub should show the variable from both:
image

On each hub, have a rule to update the value on the hour:

image

And finally a rule to run at 5 minutes after the hour to notify me if the two variables are not the same:

It only took a few minutes to set this up.

This assumes that when the hub is unresponsive (the UI won't load) that Hub Mesh is also not functioning. I'll find out if my hub locks up again.
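
The logic of the two rules is small enough to sketch outside Rule Machine. The Python below is only an illustration of the comparison, not what the hub runs; the function names and the example date are made up for the sketch.

```python
from datetime import datetime


def hour_stamp(now):
    """Format a timestamp as "MM-DD-HH", the value each hub writes on the hour."""
    return now.strftime("%m-%d-%H")


def hubs_in_sync(local_stamp, remote_stamp):
    """The 5-past-the-hour check: the hubs agree only if both stamps match."""
    return local_stamp == remote_stamp


# Arbitrary example date. Home updated its variable at 14:00, but Net's
# last update was at 13:00 (it locked up before the 14:00 update).
home = hour_stamp(datetime(2024, 1, 15, 14, 0))
net = hour_stamp(datetime(2024, 1, 15, 13, 0))

if not hubs_in_sync(home, net):
    print(f"Hub variables differ: Home={home} Net={net}")
```

Because the stamp only has hour resolution, a lockup is noticed at the first 5-past-the-hour check after a missed update, so within roughly an hour.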

Well, your rule is much simpler than mine! I check if the up time has been updated using a shared Hub Info device:

Here's a way to do it with 2 simple rules on each hub - no hub mesh or hub variables needed.

Rule 1 on hub A:
Trigger: http endpoint A on hub A
Actions:
Cancel pending actions
Send notification "Hub B is down", delayed 10 minutes, cancellable

Rule 2 on hub A:
Trigger: Every 5 minutes
Action:
Call http endpoint B on hub B

Rule 3 on hub B:
Same as rule 1, but the trigger is endpoint B on hub B and the notification message is "Hub A is down"

Rule 4 on hub B:
Same as rule 2, but calling endpoint A every 5 minutes

Obviously the timing could be whatever you want, as long as the delay on the notification is longer than the span between http requests from the other hub.
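
If it helps to see the pattern outside Rule Machine, here is a rough Python sketch of the same idea, a heartbeat watchdog: every call from the other hub re-arms a timer, and the alert is due only when the timer runs out. The class and method names are invented for illustration, times are plain seconds passed in by the caller, and unlike Rule 1 this sketch is armed from the moment it is created rather than from the first endpoint call.

```python
class HeartbeatWatchdog:
    """Tracks the last heartbeat and reports when the alert delay has elapsed."""

    def __init__(self, alert_after_s, started_at):
        self.alert_after_s = alert_after_s
        self.last_beat = started_at

    def beat(self, now):
        # Equivalent to "cancel pending actions" followed by re-arming the
        # delayed notification: the countdown restarts from `now`.
        self.last_beat = now

    def is_overdue(self, now):
        # True once the delay has passed with no heartbeat in between.
        return now - self.last_beat >= self.alert_after_s


# Hub B calls every 5 minutes (300 s); alert after 10 minutes (600 s).
watchdog = HeartbeatWatchdog(alert_after_s=600, started_at=0)
for t in (300, 600, 900):      # last call arrives at t=900, then hub B locks up
    watchdog.beat(t)

print(watchdog.is_overdue(1200))  # one missed call: still within the delay
print(watchdog.is_overdue(1500))  # 10 minutes of silence: "Hub B is down"
```

The one-missed-call case shows why the notification delay has to be longer than the calling interval: a single late or dropped request should not raise an alert.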

Something like this can also work with a single hub and another automation system. I originally had this set up with my hub calling an endpoint on Node Red. Each call would reset a pending notification from Node Red, similar to what the second hub would be doing above. I've since moved to having the hub call an endpoint on my Uptime Kuma system, which has a built in heartbeat type monitor that alerts if it isn't called for a certain time interval.

What about monitoring the devices meshed from the Aux hub? I personally use NodeRed to feed logs and maintain a device list with an update time stamp. I have a flow that runs 3 times a day querying for devices not updated within the last 24 hours, there is a filter since some don’t report or I don’t care about them. I get a push notification with results.
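
The query itself is simple. As a rough illustration (this is Python, not my actual Node-RED flow, and the device names are made up), it amounts to something like:

```python
from datetime import datetime, timedelta


def stale_devices(last_seen, now, max_age=timedelta(hours=24), ignored=frozenset()):
    """Return the names of devices whose last update is older than `max_age`.

    `last_seen` maps device name -> datetime of its most recent update.
    Devices listed in `ignored` are skipped (ones that rarely report,
    or that I don't care about).
    """
    return sorted(
        name
        for name, seen in last_seen.items()
        if name not in ignored and now - seen > max_age
    )


# Hypothetical device list, checked at noon on an arbitrary date.
now = datetime(2024, 1, 15, 12, 0)
last_seen = {
    "Ecobee Thermostat": datetime(2024, 1, 15, 11, 30),
    "Wyze Cam": datetime(2024, 1, 13, 9, 0),
    "Seasonal Plug": datetime(2023, 12, 1, 8, 0),
}

print(stale_devices(last_seen, now, ignored={"Seasonal Plug"}))
```

If every device meshed from one hub shows up in the result at once, that is a strong hint the hub itself has stopped, rather than a single device.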

I also have a flow that tracks updates from each of my hubs’ log sockets and updates a custom device on my LAN hub every 5 minutes. The device on the hub has a scheduled timeout: if it doesn’t receive an update within X minutes, it turns off and I get notified. If the flow cannot talk to the hub, it also notifies me. Basically, it checks both directions.

If you have an always on computer you can use Uptime Kuma.

I have it configured to monitor all of my home automation systems and send me a pushover notification if something goes down.

Ok, I have seen enough people post about Uptime Kuma that I finally installed it myself. Now that I know what I am doing with Portainer, it was super easy. Now I just have to add Hubitat, Node-Red, HASS, Influx, Grafana, Plex and a few other things...

...this thing is so easy to use why did I not set this up before :person_facepalming:. Couple of clicks and it is monitoring Hubitat web interface for uptime already.

Who needs Grafana Dashboards... :wink:

I've been looking for something like this - thank you for sharing....this looks totally baller.

Anyone know if you can use Pushover to send notifications if something is down?

It has just a couple of options... I already am using a Telegram bot so I set that up for myself.

Screenshots:

image

What endpoint do you use?

A local endpoint created within the trigger setup in Rule Machine:

After selecting 'local endpoint' this screen is next:

The text in blue links to: http://hubip/apps/api/1168/trigger?access_token=145ef9de-571c-42aa-9c7a-0607daaeaa7f

(the number after 'api' and the token will be different for each endpoint)
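
If you are calling the endpoint from a script rather than from another hub, the URL can be put together from those three pieces. A small Python sketch follows; the function name and the example values are placeholders, and only the URL shape comes from the link above.

```python
from urllib.parse import urlencode


def trigger_url(hub_ip, app_id, access_token):
    """Build a Rule Machine local-endpoint URL in the form shown above."""
    query = urlencode({"access_token": access_token})
    return f"http://{hub_ip}/apps/api/{app_id}/trigger?{query}"


# Placeholder values: use your own hub address, app number, and token.
print(trigger_url("192.0.2.10", 1168, "your-token-here"))
```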
