[RELEASE] Hub Watchdog - Simple way to monitor if your hub is slowing down or not

I just made some major changes to my Z-Wave hub, so there is no fresh data on that.
There is also no data on the SmartThings hub connection.

Going by the speeds you see, my Hubitat does a great job:
no humanly noticeable slowdowns.

Simply FYI, there is a minor cosmetic bug when adding the Child app Data Device ("Virtual" is misspelled as "Vitual"). No effect on operation.

@bptworld - I love your work with this app and would like to suggest an enhancement request.

I use this app to trigger a process restart for Hubitat when thresholds are exceeded (to prevent system slowness problems) using the rule indicated below. I would like to be able to exclude a timeframe from the interval checks for this particular child, so that the maintenance window (e.g. 2-3am) is left out. Tests run during that period can greatly skew the overall hub performance results (the mean).

Thoughts?
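To illustrate the request, here is a minimal Python sketch of excluding a maintenance window from the averaged results. It is not the app's actual code (Hub Watchdog is a Hubitat app); the function names `in_window` and `mean_excluding_window` and the sample readings are made up for the example.

```python
from datetime import time
from statistics import mean

def in_window(t, start, end):
    """Return True if t falls in [start, end); also handles windows crossing midnight."""
    if start <= end:
        return start <= t < end
    return t >= start or t < end

def mean_excluding_window(readings, start, end):
    """Mean response time of readings taken outside the window, or None if none remain.

    readings is a list of (time_of_day, seconds) pairs (hypothetical format).
    """
    kept = [secs for (t, secs) in readings if not in_window(t, start, end)]
    return mean(kept) if kept else None

# Hypothetical sample data: one slow reading taken during a 2-3am maintenance window.
readings = [
    (time(1, 0), 0.4),
    (time(2, 30), 5.0),
    (time(4, 0), 0.6),
]

overall = mean(secs for (_, secs) in readings)
filtered = mean_excluding_window(readings, time(2, 0), time(3, 0))
```

With this sample data the single slow reading pulls the overall mean well above the filtered one, which is the skew described above.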

added to the list

New version on GitHub...

Child:
1.1.0 - 07/10/20 - Added Maintenance Override Options

You rock!

Updated and implemented, looks great!

You know what might be helpful here: being able to set how many times in a row a device must exceed the threshold before a rule is executed. Or perhaps how many times it exceeds the threshold over a given time period before a rule is executed. The use case is that I could run a rule to restart the process only after a certain number of exceedances. Sometimes the threshold is exceeded only once, because of some activity that happens to be running at that instant; the activity then ends and hub performance goes back to normal.

What do you think?

LJ
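A rough Python sketch of the two ideas in that suggestion (failures in a row, or failures within a time period). This only illustrates the logic and is not how the app implements it; the class `FailureTracker` and its parameters are hypothetical names.

```python
from collections import deque

class FailureTracker:
    """Decide when to fire a rule based on repeated threshold failures."""

    def __init__(self, max_consecutive, max_in_window, window_secs):
        self.max_consecutive = max_consecutive  # failures in a row needed to fire
        self.max_in_window = max_in_window      # failures within window_secs needed to fire
        self.window_secs = window_secs
        self.consecutive = 0
        self.fail_times = deque()

    def record(self, now, failed):
        """Record one test result at time `now` (seconds); return True if the rule should run."""
        if failed:
            self.consecutive += 1
            self.fail_times.append(now)
        else:
            self.consecutive = 0
        # Forget failures that are older than the time window.
        while self.fail_times and now - self.fail_times[0] > self.window_secs:
            self.fail_times.popleft()
        return (self.consecutive >= self.max_consecutive
                or len(self.fail_times) >= self.max_in_window)

# Example: hourly tests, fire after 3 failures in a row or 4 failures within 6 hours.
tracker = FailureTracker(max_consecutive=3, max_in_window=4, window_secs=6 * 3600)
```

A single failure followed by a normal result resets the in-a-row count, so a one-off slowdown does not trigger the restart.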

09/25/19 - Added a failsafe, test has to fail 3 times before notification is sent.

:man_shrugging:

Yes, that is sort of what I am saying, but does that just apply to notifications or does it also apply to running rules? It also seems to only cover a 3 min period.

Also, what I was envisioning was a user setting which would trigger a rule after that number of regular scheduled tests. This would show that the problem is more ongoing than just three minutes.

both, changed the wording in app

added

It will still rerun after 1 minute, but it will keep doing so up to the max number of fails you set, instead of just 3.


New Child version on GitHub...

1.1.1 - 07/11/20 - Added user selectable 'Max number of times it can fail'

(btw, not tested... I'll let you do that! :wink:)

You are amazing as always! This will help.

I am a bit concerned about running for, say, over an hour or two, since it will work the hub and switch a bit. But this will help. I was sort of envisioning being able to test 2 or 3 times over 2-3 or more hours rather than minutes.

I know, they are never going to be satisfied... seriously, I am satisfied with your solution. And it should help. Thanks.

Thanks

I wouldn't. I should probably put in a range. How about 1 - 15 minutes? Maybe 20? If your hub is still failing after 20 minutes, you definitely have issues.

The one minute re-run has worked very well for almost a year now. Not going to change that up.

Edit: updated the code on github to have a 1 to 20 range. If you need it higher, search for 'range' and edit the number. :grin:
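For anyone trying to picture the behavior described in the last few posts (re-run after one minute, stop once the user-set max number of fails is reached, max limited to 1 to 20), here is a small Python sketch. It is only an approximation of the idea, not the app's code; `run_with_retries`, `run_test` and the `sleep` parameter are hypothetical names.

```python
import time

def run_with_retries(run_test, max_fails, retry_delay_secs=60, sleep=time.sleep):
    """Run run_test() until it passes or has failed max_fails times.

    run_test is any callable returning True on pass, False on fail.
    Returns True if the notification/rule should fire (all attempts failed).
    """
    # Keep the user setting inside the 1 to 20 range mentioned above.
    max_fails = max(1, min(20, max_fails))
    fails = 0
    while fails < max_fails:
        if run_test():
            return False          # hub recovered, nothing to do
        fails += 1
        if fails < max_fails:
            sleep(retry_delay_secs)  # wait one minute before re-running
    return True
```

Passing `sleep` as a parameter is just a convenience so the logic can be tried out without actually waiting a minute between attempts.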

Limiting the range of repeats would be safer from the software's and hub's perspective. I agree 15-20 minutes would be enough.

The reason I was envisioning being able to set the interval is that right now I restart the Hubitat process using dman2306's Hub Rebooter app every day at 4 am, as a preventative measure against the kinds of slowdowns I was experiencing in the past. I currently have Hub Watchdog set to test every 1 hour the rest of the day. What I am seeing on Watchdog is that 1, 2 or sometimes 3 times a day, an hourly test shows a slowdown above the Max setting. Then in the next hourly test it goes back to normal. I don't really have a good idea of how long those intermittent slowdowns last. They may just be coincidental with the Watchdog test. Thus, I was thinking that I could wait for 3 hourly tests to fail and then reboot.

You are right that if it goes more than 20 minutes I have a problem. Perhaps I need to set Watchdog to a shorter interval to positively determine how long these intermittent slowdowns last during the day.

LJ

I have three virtual switches with the watchdog driver, one each for z-wave, zigbee and virtual, on a dashboard. The selected attribute is lastDataPoint1. When I load that dashboard I receive two lines of the following warn message in the log.
"Excluded attribute list1 size of attribute > 1024 characters"
Is it an issue or should I just ignore it?

No issue and can/should be ignored. When you place any Device on a dashboard, it looks at all attributes for that device, even though you are only selecting one (which is a BIG waste of resources). It has been mentioned and talked about many times. As long as it isn't the attribute you want to display, simply ignore it.
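As a rough picture of what that log message implies, here is a Python sketch that walks every attribute of a device and sets aside any whose value is longer than 1024 characters. This is an assumption based only on the wording of the warning, not the platform's actual dashboard code; `split_attributes` and the sample values are made up.

```python
MAX_ATTR_CHARS = 1024  # limit taken from the warning text

def split_attributes(attributes, limit=MAX_ATTR_CHARS):
    """Return (kept, excluded): attributes within the size limit, and names of those over it."""
    kept, excluded = {}, []
    for name, value in attributes.items():
        if len(str(value)) > limit:
            excluded.append(name)   # would be reported as excluded
        else:
            kept[name] = value
    return kept, excluded

# Hypothetical device: a short attribute shown on the tile and a long list attribute.
device = {"lastDataPoint1": "0.42", "list1": "x" * 2000}
kept, excluded = split_attributes(device)
```

In this example `lastDataPoint1` is kept and `list1` is excluded, which matches the case where the warning can be ignored because the excluded attribute is not the one being displayed.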

Bryan:
Would you agree that these numbers seem to indicate a serious issue with my zwave mesh?

Although the numbers do look high, there are many factors that could account for this. I don't pretend to be an expert in zwave (or anything else for that matter :wink:) but I'm sure others will help you troubleshoot. The only thing I ask is that you start a NEW thread for troubleshooting.

Thanks.

From 1st post...

Is there a way to restrict when hub watchdog runs? I.e. only during certain times, or in certain modes, or only when a switch is set?

Is that in hub watchdog 1.09?