RM and a Reboot

Yeah, I'm a big fan of watching and reading about what you have done. I'm struggling on the debug here.

Yeah, I've now switched to doing individual repairs as opposed to an overall repair. That seemed to at least clear the routes on the zwave page, which were then filled in later with ones that look believable. I'm sure people were exhausted with me complaining about that page and the routes it was showing. Doing individual repairs and seeing the eventual update at least made me feel like something was improving...

I made a couple of what I hope were helpful tweaks over the weekend. I'm doing repairs now and will let things sit for another week to see if the routes and speed improve. Everything except my zwave network (zigbee, wemo, Sonos and LIFX) works fast, sometimes using the same automations, so I'm really trying to narrow down why zwave specifically has a problem. My network is not as big as many of yours on here, which I thought pointed to a weak mesh issue, but the devices are almost all plugged in now, they cover a small area, and they eventually work, so now I'm suspecting something else.

Everyone always says you should chase down the culprit causing the hubs to slow down, but there are no good ways to go about this. No utility to show current load, or over time... nothing. I spent countless hours with Support previously trying to figure that stuff out, and it was a bit frustrating to not really come away with a solid answer. My solution was to buy a second hub, and ultimately still have to reboot every few days. I put up with it because I don't want a cloud-based solution.

Run the enableStats link and let it run for about 5 minutes. Then run the stats link and take a screenshot. Run the disableStats when you're done. Post the results and let's see if there's something that stands out.

http://hubitat.local/hub/stats
http://hubitat.local/hub/enableStats
http://hubitat.local/hub/disableStats
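If you'd rather script that than click through it, here is a rough Python sketch of the same enable / wait / read / disable sequence. It assumes the three links above answer a plain GET without a login and that `hubitat.local` resolves on your network; `stats_url`, `fetch` and `collect` are just names I made up for the sketch.

```python
import time
import urllib.request

# Hostname taken from the links above; swap in your hub's IP if needed.
HUB = "http://hubitat.local"

ACTIONS = ("stats", "enableStats", "disableStats")


def stats_url(action: str, hub: str = HUB) -> str:
    """Build one of the three stats links from the post above."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    return f"{hub}/hub/{action}"


def fetch(url: str, timeout: float = 10.0) -> str:
    """GET a URL and return the body as text (assumes no login is required)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def collect(minutes: float = 5.0, hub: str = HUB) -> str:
    """Enable stats, wait, read the stats page, and always disable again."""
    fetch(stats_url("enableStats", hub))
    try:
        time.sleep(minutes * 60)
        return fetch(stats_url("stats", hub))
    finally:
        # Turn collection off even if the read above fails.
        fetch(stats_url("disableStats", hub))

# Usage (talks to the hub, so run it by hand):
#   print(collect(minutes=5))
```

The `finally` is there so collection is switched off again even if the stats read fails.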

Are you having to reboot both hubs?

Ummm, OK, I'll bite: what does all this mean??

App or Device Stats enabled: true shows whether stat collection is enabled.

Device stats start time: xxxxxxxxxxxx is the Unix time at which collection started.

Device Stats total runtime is how long the stats had been running as of the last refresh of the stats page.

Device or App id is the database index for the device or app.

Runcount is the number of times the device or app was triggered during the collection time. Runtimes are in ms.

Total runtime is the total number of ms the device or app was running during the collection (each execution's runtime added together).

Average run time is the average runtime of the device or app during the collection (total runtime divided by runcount).

Problem devices usually stick out like a sore thumb on the average run time or runcount.
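If you want to sort or compare these instead of eyeballing a screenshot, the page text is easy to parse. A minimal sketch, assuming each line looks like the "device id N runcount N total runtime N average run time N" lines pasted in this thread (`Stat` and `parse_stats` are my own names, not anything from the hub):

```python
import re
from typing import List, NamedTuple


class Stat(NamedTuple):
    kind: str        # "device" or "app"
    id: int          # database index of the device or app
    runcount: int    # times triggered during collection
    total_ms: int    # summed execution time in ms
    avg_ms: float    # average execution time in ms, as reported by the hub


# Line format assumed from the stats dumps posted in this thread.
LINE = re.compile(
    r"^(device|app) id (\d+) runcount (\d+) "
    r"total runtime (\d+) average run time ([\d.]+)$"
)


def parse_stats(text: str) -> List[Stat]:
    """Return one Stat per device/app line; header lines are skipped."""
    stats = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            kind, id_, runs, total, avg = m.groups()
            stats.append(Stat(kind, int(id_), int(runs), int(total), float(avg)))
    return stats


sample = """Device Stats enabled: true
device id 1089 runcount 18 total runtime 129 average run time 7.1666666667
app id 931 runcount 4 total runtime 136 average run time 34"""

# Print the slowest entries first.
for s in sorted(parse_stats(sample), key=lambda s: s.avg_ms, reverse=True):
    print(s.kind, s.id, s.runcount, s.total_ms, s.avg_ms)
```

Sorting by `avg_ms` or `runcount` puts the sore thumbs at the top.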

If you have a problem device or app, you can use the following links to go straight to it. Replace APPID or DEVICEID with the id from the stats page.

http://hubitat.local/installedapp/configure/APPID
http://hubitat.local/device/edit/DEVICEID

All your runtimes look fine btw. Here are my current stats for comparison.

A problem device or app's average run time typically looks like my app id 1065, but that one is actually a web app and not local, so this is typical for it.

Another issue you might see is a high runcount. Something that is looping a lot can cause issues.
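Both checks (high average run time, high runcount) can be roughed out in a few lines. This is only a sketch: the 500 ms limit and the "10x the median runcount" rule are numbers I picked for illustration, not anything official, and `flag_outliers` is a made-up name. The sample rows are copied from the C4 dump posted in this thread.

```python
import statistics
from typing import Dict, List, Tuple

Row = Tuple[str, int, int]  # (label, runcount, total runtime in ms)


def flag_outliers(
    rows: List[Row],
    avg_ms_limit: float = 500.0,     # hypothetical threshold
    runcount_factor: float = 10.0,   # hypothetical threshold
) -> Dict[str, List[str]]:
    """Flag entries whose average runtime or runcount stands out."""
    median_runs = statistics.median(r[1] for r in rows)
    flagged: Dict[str, List[str]] = {}
    for label, runcount, total_ms in rows:
        reasons = []
        if runcount and total_ms / runcount > avg_ms_limit:
            reasons.append("high average run time")
        if runcount > runcount_factor * median_runs:
            reasons.append("high runcount")
        if reasons:
            flagged[label] = reasons
    return flagged


# A few rows from the C4 stats posted in this thread.
rows = [
    ("device 674", 36, 321),
    ("device 1089", 1003, 17958),
    ("device 673", 35, 23924),
    ("device 419", 15, 6283),
    ("device 1258", 167, 2036),
]
print(flag_outliers(rows))
```

A flag here only means "look at this one first". As with the web app example, a slow average can be perfectly normal for something that calls out to the network.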

I end up rebooting both hubs about 3 times a week, yes. This metrics URL is very interesting... I wonder if there's some way we can scrape/store this into something that can give us a visual representation over time. Hmmm.

I'm sure you could, since it's basically just a webpage. Running stat collection continuously isn't recommended though, since this isn't a published feature and wouldn't be supported if you ran into issues because of it.

What apps or devices are common between the two? Are there any custom drivers being used?

If you find that you are experiencing slowdowns, I also recommend looking at your free memory over time.
The following is from the Hubitat support staff:
This release brings us a couple of new URLs I'd like to throw out there.
/hub/advanced/freeOSMemory returns free OS memory, in K, just like the free command that it runs
/hub/advanced/freeOSMemoryStatus returns a GREEN/ORANGE/RED status. I've picked arbitrary numbers of 120000 for ORANGE and 50000 for RED, but that's just something to start with.

I suggest running the first option at multiple points to see if your memory is low.
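If you log the raw number yourself, you can apply the same GREEN/ORANGE/RED idea to your own readings. A small sketch using the 120000 and 50000 figures quoted above; treating them as "strictly below" cutoffs is my assumption, and `memory_status` is just a name for the example.

```python
def memory_status(
    free_kb: int,
    orange_below: int = 120_000,  # figure quoted above for ORANGE
    red_below: int = 50_000,      # figure quoted above for RED
) -> str:
    """Classify a free-memory reading (in K) as GREEN, ORANGE or RED.

    Assumption: a reading strictly below a cutoff triggers that status.
    """
    if free_kb < red_below:
        return "RED"
    if free_kb < orange_below:
        return "ORANGE"
    return "GREEN"


# Example readings in K (made-up values for illustration).
for reading in (250_000, 119_999, 49_000):
    print(reading, memory_status(reading))
```

Keeping the cutoffs as parameters makes it easy to tune them once you know what your own hub looks like when it starts to slow down.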

Oh that's awesome! I can monitor it periodically.

Oh yes, definitely some custom drivers. IOTaWatt, hubduino, tons of hubconnect v2 RC2 drivers, iphone wifi presence sensors, you name it. Things to make the system useful. There are more 'experimental' drivers/apps on my no-radio hub, but out of necessity, I also have custom drivers for actual radio-based devices on my secondary hub. Many of the stock drivers are missing important features, or are just less-reliable than the custom stuff found buried in the forums.

My complaint has never been that the hubs should be rock-solid with custom stuff, but simply that it's damn near impossible to self-diagnose the problems without any tools to track down the culprits. Would be amazing to see load per RM rule, overall load, memory, etc. Real-time health, essentially. It's nice to see these undocumented suggestions here, which might help me out over time.

I use things like device watchdog to let me know when it's time to reboot, but sometimes that doesn't catch it, and I still find out due to delayed actions for lighting, guest motion notification, etc. I always laugh when I get notified I have a guest like 5 seconds after they knock, when it took them 30 seconds to make it up my driveway. Just shake my head and smile -- oh well! haha

One thing I can say for sure is that my C4 (experimental) responds MUCH slower to UI navigation than my C5 (radio/more-pure.) The difference is staggering. Like 5-10 seconds to get my device/apps to display on C4, where C5 is almost instant.

Energy monitoring can sometimes be extremely chatty.

Check the hub stats links I posted earlier. Maybe they will show you something that is acting up on your C4.

Yeah this will be handy -- out of curiosity, what's the parasitic loss by having the stats enabled?

Device Stats enabled: true
Device stats start time: 1603139410679
Device stats total run time: 4891
device id 674 runcount 1 total runtime 8 average run time 8
device id 675 runcount 1 total runtime 4 average run time 4
device id 676 runcount 1 total runtime 9 average run time 9
device id 677 runcount 1 total runtime 2 average run time 2
device id 678 runcount 1 total runtime 2 average run time 2
device id 679 runcount 1 total runtime 2 average run time 2
device id 680 runcount 1 total runtime 2 average run time 2
device id 1089 runcount 18 total runtime 129 average run time 7.1666666667
device id 681 runcount 1 total runtime 3 average run time 3
device id 682 runcount 1 total runtime 2 average run time 2
device id 683 runcount 1 total runtime 7 average run time 7
device id 684 runcount 1 total runtime 2 average run time 2
device id 685 runcount 1 total runtime 2 average run time 2
device id 686 runcount 1 total runtime 2 average run time 2
device id 687 runcount 1 total runtime 2 average run time 2
device id 688 runcount 1 total runtime 6 average run time 6
device id 689 runcount 1 total runtime 2 average run time 2
device id 690 runcount 1 total runtime 2 average run time 2
device id 1262 runcount 1 total runtime 22 average run time 22
App Stats enabled: true
App stats start time: 1603139410684
App stats total run time: 4895
app id 931 runcount 4 total runtime 136 average run time 34
app id 937 runcount 2 total runtime 21 average run time 10.5
app id 899 runcount 2 total runtime 60 average run time 30
app id 1 runcount 2 total runtime 115 average run time 57.5

Need to let it run for a few minutes... Refresh the stats after about 5 minutes or so.

Just for fun...

My 'experimental' C4:

Device Stats enabled: true
Device stats start time: 1603139410679
Device stats total run time: 1052231
device id 674 runcount 36 total runtime 321 average run time 8.9166666667
device id 675 runcount 36 total runtime 306 average run time 8.5
device id 676 runcount 36 total runtime 280 average run time 7.7777777778
device id 677 runcount 36 total runtime 215 average run time 5.9722222222
device id 678 runcount 36 total runtime 276 average run time 7.6666666667
device id 679 runcount 36 total runtime 349 average run time 9.6944444444
device id 680 runcount 36 total runtime 252 average run time 7
device id 1089 runcount 1003 total runtime 17958 average run time 17.9042871386
device id 681 runcount 36 total runtime 251 average run time 6.9722222222
device id 682 runcount 36 total runtime 360 average run time 10
device id 683 runcount 36 total runtime 390 average run time 10.8333333333
device id 684 runcount 36 total runtime 337 average run time 9.3611111111
device id 685 runcount 36 total runtime 316 average run time 8.7777777778
device id 686 runcount 36 total runtime 275 average run time 7.6388888889
device id 687 runcount 36 total runtime 321 average run time 8.9166666667
device id 688 runcount 36 total runtime 333 average run time 9.25
device id 689 runcount 36 total runtime 379 average run time 10.5277777778
device id 690 runcount 36 total runtime 263 average run time 7.3055555556
device id 1262 runcount 36 total runtime 334 average run time 9.2777777778
device id 1258 runcount 167 total runtime 2036 average run time 12.1916167665
device id 833 runcount 5 total runtime 32 average run time 6.4
device id 1261 runcount 4 total runtime 34 average run time 8.5
device id 1240 runcount 18 total runtime 147 average run time 8.1666666667
device id 673 runcount 35 total runtime 23924 average run time 683.5428571429
device id 86 runcount 34 total runtime 1131 average run time 33.2647058824
device id 1264 runcount 5 total runtime 732 average run time 146.4
device id 85 runcount 34 total runtime 882 average run time 25.9411764706
device id 738 runcount 5 total runtime 32 average run time 6.4
device id 1246 runcount 6 total runtime 89 average run time 14.8333333333
device id 1247 runcount 4 total runtime 79 average run time 19.75
device id 419 runcount 15 total runtime 6283 average run time 418.8666666667
device id 1249 runcount 4 total runtime 70 average run time 17.5
device id 1248 runcount 5 total runtime 65 average run time 13
device id 1243 runcount 5 total runtime 75 average run time 15
device id 1250 runcount 3 total runtime 78 average run time 26
device id 1284 runcount 1 total runtime 38 average run time 38
device id 1285 runcount 1 total runtime 23 average run time 23
device id 1286 runcount 1 total runtime 36 average run time 36
device id 196 runcount 9 total runtime 2165 average run time 240.5555555556
device id 1287 runcount 1 total runtime 7 average run time 7
device id 132 runcount 7 total runtime 1026 average run time 146.5714285714
device id 194 runcount 6 total runtime 1395 average run time 232.5
device id 420 runcount 6 total runtime 2660 average run time 443.3333333333
device id 1288 runcount 1 total runtime 14 average run time 14
device id 195 runcount 6 total runtime 1613 average run time 268.8333333333
device id 1289 runcount 1 total runtime 33 average run time 33
device id 545 runcount 3 total runtime 685 average run time 228.3333333333
device id 1290 runcount 1 total runtime 6 average run time 6
device id 1172 runcount 2 total runtime 71 average run time 35.5
App Stats enabled: true
App stats start time: 1603139410684
App stats total run time: 1052284
app id 931 runcount 531 total runtime 15382 average run time 28.9679849341
app id 937 runcount 264 total runtime 2923 average run time 11.0719696970
app id 899 runcount 343 total runtime 12032 average run time 35.0787172012
app id 1 runcount 422 total runtime 34359 average run time 81.4194312796
app id 920 runcount 85 total runtime 2338 average run time 27.5058823529
app id 921 runcount 4 total runtime 348 average run time 87
app id 5 runcount 24 total runtime 316 average run time 13.1666666667
app id 387 runcount 22 total runtime 3691 average run time 167.7727272727
app id 907 runcount 4 total runtime 375 average run time 93.75
app id 232 runcount 7 total runtime 1386 average run time 198
app id 908 runcount 4 total runtime 474 average run time 118.5
app id 914 runcount 11 total runtime 6137 average run time 557.9090909091
app id 910 runcount 4 total runtime 453 average run time 113.25
app id 919 runcount 10 total runtime 1302 average run time 130.2
app id 909 runcount 3 total runtime 349 average run time 116.3333333333
app id 906 runcount 3 total runtime 300 average run time 100
app id 911 runcount 3 total runtime 251 average run time 83.6666666667
app id 491 runcount 3 total runtime 266 average run time 88.6666666667
app id 913 runcount 10 total runtime 4496 average run time 449.6
app id 938 runcount 23 total runtime 5298 average run time 230.3478260870
app id 939 runcount 21 total runtime 3775 average run time 179.7619047619
app id 36 runcount 1 total runtime 54 average run time 54
app id 940 runcount 19 total runtime 3348 average run time 176.2105263158
app id 49 runcount 1 total runtime 926 average run time 926
app id 941 runcount 18 total runtime 3214 average run time 178.5555555556
app id 942 runcount 18 total runtime 2403 average run time 133.5
app id 943 runcount 15 total runtime 3037 average run time 202.4666666667
app id 43 runcount 4 total runtime 555 average run time 138.75

My 'reliable' C5 (radio devices):

Device Stats enabled: true
Device stats start time: 1603139623747
Device stats total run time: 881981
device id 196 runcount 45 total runtime 1500 average run time 33.3333333333
device id 1028 runcount 959 total runtime 24030 average run time 25.0573514077
device id 100 runcount 18 total runtime 110 average run time 6.1111111111
device id 136 runcount 61 total runtime 1240 average run time 20.3278688525
device id 101 runcount 21 total runtime 104 average run time 4.9523809524
device id 240 runcount 5 total runtime 240 average run time 48
device id 241 runcount 5 total runtime 176 average run time 35.2
device id 242 runcount 5 total runtime 184 average run time 36.8
device id 227 runcount 5 total runtime 236 average run time 47.2
device id 243 runcount 5 total runtime 197 average run time 39.4
device id 238 runcount 5 total runtime 252 average run time 50.4
device id 239 runcount 5 total runtime 194 average run time 38.8
device id 266 runcount 5 total runtime 1970 average run time 394
device id 309 runcount 5 total runtime 2188 average run time 437.6
device id 324 runcount 44 total runtime 1358 average run time 30.8636363636
device id 105 runcount 18 total runtime 119 average run time 6.6111111111
device id 771 runcount 55 total runtime 327 average run time 5.9454545455
device id 193 runcount 6 total runtime 22 average run time 3.6666666667
device id 932 runcount 19 total runtime 129 average run time 6.7894736842
device id 137 runcount 6 total runtime 98 average run time 16.3333333333
device id 740 runcount 8 total runtime 58 average run time 7.25
device id 292 runcount 2 total runtime 39 average run time 19.5
device id 325 runcount 3 total runtime 38 average run time 12.6666666667
device id 327 runcount 3 total runtime 19 average run time 6.3333333333
device id 104 runcount 18 total runtime 107 average run time 5.9444444444
device id 143 runcount 5 total runtime 183 average run time 36.6
device id 481 runcount 1 total runtime 23 average run time 23
device id 71 runcount 1 total runtime 11 average run time 11
device id 135 runcount 1 total runtime 7 average run time 7
device id 144 runcount 1 total runtime 8 average run time 8
device id 641 runcount 2 total runtime 86 average run time 43
device id 279 runcount 1 total runtime 10 average run time 10
device id 964 runcount 1 total runtime 10 average run time 10
device id 73 runcount 1 total runtime 11 average run time 11
device id 138 runcount 4 total runtime 25 average run time 6.25
device id 102 runcount 2 total runtime 13 average run time 6.5
App Stats enabled: true
App stats start time: 1603139623747
App stats total run time: 881999
app id 866 runcount 593 total runtime 20510 average run time 34.5868465430
app id 130 runcount 21 total runtime 1420 average run time 67.6190476190
app id 385 runcount 21 total runtime 1644 average run time 78.2857142857
app id 95 runcount 5 total runtime 526 average run time 105.2
app id 52 runcount 49 total runtime 1341 average run time 27.3673469388
app id 36 runcount 70 total runtime 3438 average run time 49.1142857143
app id 42 runcount 5 total runtime 1801 average run time 360.2
app id 85 runcount 5 total runtime 2147 average run time 429.4
app id 94 runcount 5 total runtime 3545 average run time 709
app id 129 runcount 19 total runtime 4214 average run time 221.7894736842
app id 386 runcount 19 total runtime 4500 average run time 236.8421052632
app id 514 runcount 2 total runtime 24 average run time 12
app id 100 runcount 6 total runtime 899 average run time 149.8333333333
app id 99 runcount 2 total runtime 546 average run time 273

On the C5:
Device 1028 and app 866 both seem to have an excessive number of runs compared to the rest of your devices.

On the C4:
Devices 1089 and 673, and apps 931, 899, and 1.

C4 device 1089 - HubConnect device for C5
C4 device 673 - IoTaWatt Parent device
C4 app 931 - WebCore (I just installed it about an hour ago for fun.)
C4 app 899 - HubConnect app for C5 hub.
C4 app 1 - Dashboards

C5 device 1028 - HubConnect device for C4
C5 app 866 - HubConnect app for C4 hub.

Generally speaking, it sure would be nice if the hub could look at runtime/load for a certain task and alert the user that the experience may become unstable/less desirable DUE to a certain activity.

Been asked for a bunch but I wouldn't hold my breath. The devices and apps you listed should be fine assuming you aren't seeing a bunch of errors in the logs. Dashboards seems excessive though. You have a bunch of tablets open all the time?

I only have a single dashboard open on my PC while working, so it seems odd that it would be pulling many resources. Interesting.
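One way to put a number on "excessive" is to normalize against the collection window. A rough sketch, assuming the "App stats total run time" header is the window length in ms (that matches the explanation earlier in the thread, but the units are my assumption). `load_summary` is a made-up helper, and the inputs are app id 1 from the C4 dump above.

```python
from typing import Dict


def load_summary(runcount: int, total_ms: int, window_ms: int) -> Dict[str, float]:
    """Return runs per minute and percent of the window spent executing."""
    minutes = window_ms / 60_000
    return {
        "runs_per_min": runcount / minutes,
        "busy_pct": 100 * total_ms / window_ms,
    }


# C4 app id 1 (Dashboards): runcount 422, total runtime 34359 ms,
# with an app stats window of 1052284 ms.
dash = load_summary(runcount=422, total_ms=34359, window_ms=1052284)
print(f"{dash['runs_per_min']:.1f} runs/min, {dash['busy_pct']:.2f}% of the window")
```

On those numbers the app ran roughly 24 times a minute and was executing for a little over 3% of the window. Whether that counts as a lot for a single open dashboard is a judgment call, but it makes the two hubs comparable even though their collection windows differ.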