[RELEASE] Hub Watchdog - Simple way to monitor if your hub is slowing down or not

Those are my numbers.
It's maybe just normal speed.
I have a timed reboot every night.

zwave
Mean Delay: 0.544
Median Delay: 0.468

zigbee
Mean Delay: 4.071
Median Delay: 0.567

virtual
Mean Delay: 0.169
Median Delay: 0.15

150 devices on this hub

My slowdown happens usually after 3+ days without a reboot. This most recent one happened within 24 hours.

195 devices on the hub.

There is one non HE app you may have forgotten to disable.....Hub Watchdog :wink:


On my older hub I have just cloud stuff
virtual only
Mean Delay: 0.066
Median Delay: 0.043
Harmony
Nest protect
roomba
Solar panels envoy
Web pinger
UPS power detection
etc..
I mean, some of it is local LAN.
There is nothing on Zigbee or Z-Wave on it
(I wish I could use it as a repeater),
and I'm planning to move everything cloud-based onto it
(ecobee, Echo Speaks, etc.),

and send just the virtual devices actually used in automations over to the hub with the real devices.

I also want to do as much automation as possible in
Hubitat Simple Lighting or Rule Machine
and eliminate third-party code as much as I can,
because there are 500 people who can help me build a rule
and about 2 who can troubleshoot some solar Envoy integration.
Troubleshooting will probably be easier too,
and official support is available in most cases.

HEY, lol


easy help, disable it and use :

I don't see this causing many problems.
But man, some numbers didn't turn red when they should so that must be it :slight_smile:

That's my metric of choice...but I love options so I've been keeping an eye on this thread. Looks like @bptworld made a very useful app...and it requires a LOT less time investment to set up. That said, my NR websocket setup has next to no impact on the hub so I'll stick with it for now.

This app can be set up easily by a lot more users and hopefully we can find some correlations with the hub slowdowns.


I did disable it for some time. Hub stayed slow as measured by lights and outlets coming on. So I turned it back on :slight_smile:

I'm about to set it up too, on my NAS in Docker.
Somebody reported that a Raspberry Pi doesn't have enough memory.

It would be awesome if Hubitat released an official built-in speedometer,
but it might also open a can of worms:
people who hadn't noticed any slowdown would start requesting help based on the measured numbers.
I'm a good example; to my human perception it's not very noticeable.
These are just numbers; the platform works pretty well otherwise.

I get exactly what you mean. In my case, I wanted some "confirmation" that my perception of a slowdown every 3-4 days was not just my imagination. And @bptworld's Hub Watchdog confirmed that .....

I tried finding a "solution" today. In the end, I think that scheduled reboots every 3 days work pretty well ....


brutal but effective

Every driver we publish updates state based on the device-reported state change; none of them issue events when the digital command is issued.
For devices that aren't capable of self-reporting (only very old Z-Wave devices at this point), we issue a report request after the command request.
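
To illustrate the pattern described above, here is a minimal Python sketch. It is a toy simulation only, not Hubitat driver code (real drivers are written in Groovy against the platform's own API), and every name in it (`FakeDevice`, `Driver`, `outbox`, the `"report?"` message) is hypothetical. It shows state changing only when the device reports back, with an explicit report request for devices that don't self-report.

```python
class FakeDevice:
    """Hypothetical device; reports its state only if self_reporting is True."""

    def __init__(self, self_reporting=True):
        self.self_reporting = self_reporting
        self.switch = "off"
        self.outbox = []  # reports waiting to be delivered to the driver

    def handle(self, message):
        if message in ("on", "off"):
            self.switch = message
            if self.self_reporting:
                self.outbox.append(self.switch)
        elif message == "report?":
            # Explicit report request: always answer with the current state.
            self.outbox.append(self.switch)


class Driver:
    """Hypothetical driver that trusts device reports, not its own commands."""

    def __init__(self, device):
        self.device = device
        self.state = "off"
        self.events = []

    def on(self):
        # Send the command only; no event is recorded at this point.
        self.device.handle("on")
        if not self.device.self_reporting:
            # Devices that can't self-report: request a report after the command.
            self.device.handle("report?")

    def parse(self):
        # State and events change only when a device report is processed.
        while self.device.outbox:
            report = self.device.outbox.pop(0)
            self.state = report
            self.events.append(report)


driver = Driver(FakeDevice(self_reporting=False))
driver.on()
state_before_report = driver.state  # still "off": no report processed yet
driver.parse()
state_after_report = driver.state   # "on": taken from the device report
```

The alternative, recording the event inside `on()` itself, would show the new state even if the device never acted on the command.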

Good to know! To me, that is the only logical way to do it...

But I can point to probably 10 ST or HE user drivers offhand that do the digital events immediately in the on/off/level functions.

Thanks for the info!

Sure, likely they don't understand the device and/or protocols well enough to get them to report as required, who knows...

Since it seems almost impossible to find a complete solution to hub slowdowns, and since most of us want the openness and flexibility to run whatever we want on our hubs anyway (which will only perpetuate the problem), wouldn't it make more sense to put more effort into making sure that hub reboots don't negatively affect apps/drivers and that best practices are available for code to recover gracefully from reboots? I for one am perfectly OK with doing a scheduled reboot every night. The problem I have is the effort afterwards to fix up things that don't recover well (e.g. the Chromecast beta app and others need to run through a rediscovery process, and I don't have fixed IP addresses due to a router limitation at this time, which causes issues with some WiFi devices). I agree we of course need to keep pressure on fixing the underlying issues with hub performance, but let's also work on improving reboot recovery.

Could someone explain the difference between the 'mean' delay and the 'median' delay please?
Isn't the mean delay an average of all the measured delay times? If so, what's the median?
Thanks.

The mean is the sum of all the numbers in the set divided by the count of numbers in the set (the average).
The median is the middle point of a number set, in which half the numbers are above the median and half are below. (It is not the most common value; that would be the mode.)
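
As a small illustration of the difference, here is a sketch using Python's standard `statistics` module. The delay values are made up for the example (they are not real Hub Watchdog readings): four quick responses and one large spike.

```python
from statistics import mean, median

# Hypothetical delay samples in seconds: mostly fast, with one big spike.
delays = [0.45, 0.50, 0.55, 0.60, 18.0]

mean_delay = mean(delays)      # sum of the values divided by how many there are
median_delay = median(delays)  # middle value once the list is sorted

print(f"Mean delay:   {mean_delay:.3f}")
print(f"Median delay: {median_delay:.3f}")
```

With these numbers the mean comes out around 4.02 while the median is 0.55: a single slow reading drags the mean up a lot but barely moves the median. A mean far above the median usually points to a few large outliers rather than uniformly slow responses.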

Thanks for the reply. Much appreciated.
The reason I asked is that the figures you have given above
show Zigbee to be wildly different. All your median figures are lower.
Just wondering which figure is the best to use.