Node-RED Flow - Hubitat Performance Monitor

Anyone else finding their hub rebooting a lot during the maintenance period? I have my CriticalsForReboot set to 6 and CriticalThreshold to 15. I could raise the threshold, but I don't want to go too high, because then it could mask a real issue outside the maintenance window.

I am seeing response times averaging 20-30 seconds between 2-3am, so I'm curious what others have experienced.

Maybe an option can be added to allow a longer threshold only during maintenance.

20-30 seconds... wow.
What are your response times normally?
When my main hub spikes above 5 seconds I get worried. The worst I've seen is 10 seconds, and that was because I had other nodes running Maker API calls and backing up logs at the same time. Are you running anything else on a repeat interval through Node-RED? I spaced out my node interval timings to reduce the chance of them firing at the same time. Here's my response time graph for the past 24 hours.
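For anyone else wanting to stagger their polling, here's a minimal sketch of the idea (the job names and interval are illustrative, not from my actual flow): spread repeat jobs evenly across the interval so they never fire at the same moment.

```javascript
// Sketch: spread N periodic jobs evenly across one repeat interval so
// they never fire together. Names/intervals are examples, not my flow.
function staggeredOffsets(jobNames, intervalSeconds) {
  const step = Math.floor(intervalSeconds / jobNames.length);
  return jobNames.map((name, i) => ({
    name,
    // each job starts i * step seconds into the interval
    offsetSeconds: i * step,
  }));
}

// Example: three Maker API polls on a 60-second repeat
const schedule = staggeredOffsets(["perfmon", "temps", "batteries"], 60);
console.log(schedule); // perfmon at 0s, temps at 20s, batteries at 40s
```

In Node-RED you'd apply the offsets via each inject node's schedule rather than in code, but the principle is the same.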

I am not sure why, then; I am not even using the Maker API. I just have two eventsockets: one for device monitoring and the other to refresh InfluxDB so devices report properly. As you can see, the average time is around 2 seconds, although this has gone up recently; it used to be below 2.

I do have a Node-RED flow that backs up the config, but it runs at 4:30, which is outside the maintenance window.

I also stripped as many third-party apps/drivers from the hub as I could; what's left I can't really replicate with stock apps/drivers.

Below are all the custom apps/drivers I currently run; not sure if any of these could cause issues.

Custom apps: Alarmdecoder, MyQ Lite, Presence Central and WATO
Custom drivers: Alarmdecoder, ApiXU, Logitech Harmony Hub, My Aeon Home Energy Monitor v3, My Q Garage Door, Virtual Presence Plus, Switch Timer

Not sure what that 1 minute was, but I normally do not see that.

Here are 3 days; you can see it's consistent during the maintenance window.

This is probably not your issue, but why do you need two eventsockets? Can't you just have 2 separate outputs, or set up a websocket relay in Node-RED?

What is the size of your backups? Mine are usually around 15MB.

The websocket relay wasn't working because Node-RED was adding a leading "/" in the path field of the "Listen on" websocket node. So if I used ws://127.0.0.1:1880, Node-RED would change it to /ws://127.0.0.1:1880, which won't work.

Mine are also about 15MB.

Thanks for this. I found an error in a node that was causing problems with my table, even though it was formatted exactly the same as yours.

In the "insert" node after "Check Page",

var query = "INSERT INTO responsetimes(deviceId, displayName, value) VALUES(";
query += "1,";                         // deviceId
query += "'Hubitat Elevation Hub',";   // displayName
query += "'" + msg.responseTime + "'"; // the measured response time
query += ")";

msg.topic = query; // the database node executes the query passed in msg.topic

return msg;

I think it was totaltime in the downloaded version, which was causing errors in my DB.
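For reference, here's that node's logic rewritten as a plain function so it can be sanity-checked outside Node-RED; the fallback to msg.totaltime is my assumption about what the downloaded flow still emits, so it tolerates either property name.

```javascript
// Sketch of the "insert" node after "Check Page" as a testable function.
// Falls back to msg.totaltime in case the downloaded flow still uses the
// older property name (an assumption based on the post above).
function buildInsert(msg) {
  const responseTime = msg.responseTime ?? msg.totaltime;
  msg.topic =
    "INSERT INTO responsetimes(deviceId, displayName, value) " +
    "VALUES(1, 'Hubitat Elevation Hub', '" + responseTime + "')";
  return msg;
}

console.log(buildInsert({ responseTime: 2.4 }).topic);
```

Inside the actual function node you'd keep only the body (the `msg.topic = ...` and `return msg`), since Node-RED supplies `msg` itself.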


I've noticed that my hub slows down every 15 minutes. So I did some digging and discovered that the Maker API call for temperature (from the other Node-RED thread) was firing just a few seconds before this one.

I'll move them away from one another. The Maker API call takes about 15 seconds to report all my temp devices - 21 devices in total.


Wow, it makes a big difference to schedule this at a different time from the temps/batteries Maker API call. FYI, a huge difference.


@btk, this perfmon has been working great for me. The only strange thing I'm seeing is that "Critical" messages are not being logged unless they are above a threshold. I may have bad assumptions so I figured I'd reach out to understand your flow logic.

My assumption:

  • Critical alerts should be logged and sent to Pushover... just subject to the rate limiter (depending on time, etc.)

What's happening:

  • Critical messages are only logged or sent to Pushover if the count exceeds criticalsForReboot.

Therefore, if my hub gets one critical value, it is never logged and I never receive a notification. Is this by design? Please see the section of code that is creating this "issue".

This doesn't affect my InfluxDB/Grafana data because that occurs earlier in the flow. That's actually how I realized this problem: my Grafana showed a spike during maintenance that I could not find anywhere in my log file. See the skipped log below (the check should occur every 5 minutes).
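To make my assumption concrete, here's a sketch of the behavior I expected (property names and thresholds are illustrative, not the flow's actual code): every critical sample gets logged/notified, and only the reboot decision waits for criticalsForReboot consecutive criticals.

```javascript
// Sketch of the critical-handling logic I expected, NOT the flow's code.
// Thresholds mirror my settings from earlier in the thread.
const criticalThreshold = 15; // seconds before a sample counts as critical
const criticalsForReboot = 6; // consecutive criticals before a reboot

function handleSample(state, responseTime) {
  const isCritical = responseTime > criticalThreshold;
  state.consecutiveCriticals = isCritical ? state.consecutiveCriticals + 1 : 0;
  return {
    logCritical: isCritical, // log/notify EVERY critical (rate limiter aside)
    reboot: state.consecutiveCriticals >= criticalsForReboot, // reboot only after N in a row
  };
}

const state = { consecutiveCriticals: 0 };
console.log(handleSample(state, 20)); // first critical: logged, but no reboot yet
```

With this split, a single 20-second spike during maintenance would still show up in the log even though it never triggers a reboot.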

So I un-paired my Aeotec Gen 1 HEM and my response time went from an average of around 2 seconds down to 1. I couldn't believe the HEM could put such a load on the hub just for energy reporting.

Also my response time during maintenance has dropped down to an average of 15 seconds, which is significantly better.

I still want to use the HEM so I may need to dig into the driver I was using. I have tried the stock driver but the energy level never updates, so I had to use a custom one.

I had a hub lockup yesterday, but my web socket and log socket both continued to run fine.

Not sure what happened.

All my HTTP stuff was failing, such as Yeelight and my alarm system, and I was unable to navigate to the hub, but Node-RED was chugging along fine. Ideas?

As I understand it, the websockets run on their own threads outside of the Hub UI. I could be 100% wrong about that, but it's what I have noticed. I think they function the same way as the hub utilities to reboot and shutdown.

Were they still pushing out data or were they just "up"?

They were still pushing out data, and the performance monitor was returning normal times.

So, somehow my hub's network connection was dead for some (new?) connections, but OK for some (old?) connections. Note the response times are a little longer just after the hub reboot, due to the backup process. Hub was rebooted by me at 9:12am.

Here is my history:

Hub was unresponsive to some new connections from about 4am, and was rebooted at 9:12am.
(This app sends a Pushover notification upon system reboot.)

This is strange because you said the web interface wasn't accessible. The performance monitor calculates the load time of the web interface (apps list), so how is it possible that the monitor is returning 3 second load times? I could understand that the sockets continue to run as they are a separate protocol. @btk any thoughts on this?

This is great. As much as I love Hubitat, the slowdowns are really exasperating. No especially large apps loaded, and aside from some chatty Z-Wave devices, not much going on.
I swear SmartThings, with its latest version, is faster than Hubitat, which is insane.

With Node-RED, has anyone pinpointed areas causing slowdowns, outside of custom stuff?

I would challenge that any given day :grin: It really depends on your environment. An app/driver doesn't have to be large to cause slowness; it depends on what it is doing. Misbehaving devices can also cause unresponsiveness, as can network issues that are not easily identifiable because they are covered up by other applications.

I had daily slowdowns and now have none, nada, niente... I removed a faulty Zigbee bulb, removed the InfluxDB app and use Node-RED for that instead, removed the original Homebridge implementation and wrote a new plugin that uses the Maker API, and removed the Ecobee app, all while adding about 25 more devices to one hub.

The thing with these slowdowns is to look broadly. People often look for the one single item that is causing it, but experience shows that several "smaller" items can compound into a bigger issue.


Yes, I already pulled the Ecobee app; that helped. I need to get Node-RED set up to be able to figure out what's going on.
So I'm curious: if I load a driver but never attach it to a device, can that have any effect?

It shouldn't. If a user driver isn't assigned to a device, it never runs. Same with user apps that are added to the user app code area but not added to the "Apps" section as an installed app.

My main hub graph seems to have settled down a bit since the last update (applied last night). Hopefully it sticks. The huge spikes are during the 2-3am maintenance window.


Something happened to my hub last night around the time of the nightly hub activities. There's clearly a jump in response time.