After migrating from a very stable C7 setup (it ran nearly perfectly for about 2 years) to a C8,
I am getting Elevated/Severe Load warnings many times per day. A reboot, or a Soft Reset followed
by a Restore, helps for a while but is obviously not a solution to the problem(s).
During these warning periods many RM rules simply misfire (missed triggers, or whatever
it is). The WAF rating used to be near 100% but is now fast approaching 0%. She even asked me
to shut down all automations and let her control everything manually. In my case this is
very easy: I simply need to turn off the hub. But of course I want to return to the happy days
of my very stable C7 setup.
Here are the App stats right after the reboot (the first few high-load apps):
I am very confused about what I am actually seeing and how to interpret this data.
My BIGGEST questions are:
- Where EXACTLY should I be looking for the problem?
- Are these App stats the place to look for problems, or do I have to check somewhere else?
(I think the Z-Wave and Zigbee meshes are in good shape, but of course I could be wrong.)
- What is actually the reason(s) for all these high load warnings?
While I was writing the above message, another Severe Load warning popped up.
It was less than 15 min after the reboot.
Here are the App stats after the warning popped up:
Average time hub spent processing each app's method (**Avg, ms** column)
Your Average MS column shows a severe problem. 51,448 AvgMS is almost a full minute. 60,000 would be 1 min exactly... and you're showing 5 Apps where each is 25 seconds or more PER RESPONSE.
% of total is 250%?? One app is using 2.5 of the 4 CPU cores? And you have 5 showing that EACH is using more than 24% of a core?
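To make the arithmetic in the reply above explicit, here is a small Python sketch of the two conversions. The function names are made up for illustration, and the "100% equals one fully busy core" reading of the % of total column is the reply's assumption, not a documented definition of that column.

```python
def ms_to_seconds(avg_ms: float) -> float:
    """Convert an average duration in milliseconds to seconds."""
    return avg_ms / 1000.0


def cores_used(percent_of_total: float) -> float:
    """Convert a percentage to a number of cores.

    ASSUMPTION: 100% corresponds to one fully busy core, as the reply
    above reads it. The stats page may define the column differently.
    """
    return percent_of_total / 100.0


if __name__ == "__main__":
    avg_s = ms_to_seconds(51_448)  # value quoted in the thread
    print(f"{avg_s:.1f} s per call, {avg_s / 60:.0%} of a minute")
    print(f"{cores_used(250)} cores under the one-core-per-100% assumption")
```

Under that assumption, 51,448 ms works out to roughly 51 seconds per call, and 250% to 2.5 cores.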
On my busiest hub, here's my top 5:
I'd go into your Apps page and disable those 5 right away just to see if there's some vicious cycle going on. Wait 30 mins and see if that Severe Load goes away.
First of all, all these apps are about a year old and were migrated from a very stable C7 setup.
How did they suddenly become so problematic?
My EE brain refuses to accept and understand this.
Second, I already tried your suggestion. As soon as the problematic apps at the top
are paused, a set of different ones almost immediately takes their place.
So this is not really a productive debugging approach. I have 200+ RM rules.
Here is the app from the top of the list:
Why does this app create a severe hub load?
Here are the events from the triggering light sensor:
Is it too many trigger events, or is something else going on?
lol wow! most likely. wouldn't hurt to get that straightened out
So where is the bottleneck?
Certainly I can modify the rule and prevent re-triggering with a Private Boolean and
a controlled delay. And sure, I will try this.
But I don't have any control over the sensor itself or its driver (this is a custom driver
for "Xiaomi Aqara Mijia Sensors and Switches").
In other words, the sensor/driver will still be generating the same number of events.
And if this is the source of the problem, then I have no idea how to fix it.
Somehow this was not a problem for more than a year running this rule/driver on the C7.
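For illustration only, the "Private Boolean plus controlled delay" idea can be sketched in Python. This is not Rule Machine syntax; the class name, the 60-second interval, and the injected clock are all hypothetical. The sketch ignores any trigger that arrives before a minimum interval has passed since the last run. As noted above, the sensor still emits every event; only the rule's actions are skipped.

```python
import time


class ThrottledRule:
    """Run an action at most once per `min_interval_s` seconds.

    Hypothetical stand-in for a rule gated by a Private Boolean that is
    set false when the rule runs and set true again after a delay.
    """

    def __init__(self, min_interval_s, action, clock=time.monotonic):
        self.min_interval_s = min_interval_s
        self.action = action
        self.clock = clock      # injectable so the logic can be tested
        self._last_run = None   # time of the last accepted trigger

    def on_trigger(self, value):
        """Handle one sensor event; return True if the action ran."""
        now = self.clock()
        if self._last_run is not None and now - self._last_run < self.min_interval_s:
            return False        # still inside the hold-off window: ignore
        self._last_run = now
        self.action(value)
        return True


if __name__ == "__main__":
    rule = ThrottledRule(60, lambda lux: print("rule ran, lux =", lux))
    rule.on_trigger(120)   # runs the action
    rule.on_trigger(121)   # ignored: arrives within 60 s of the last run
```

The hold-off here starts at the last accepted trigger, so a steady stream of events cannot postpone the next run indefinitely.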
Oh okay, the driver is meant to work like that? There's no setting to change the lux trigger in the driver? That's like, what, every 30 seconds or so?