Rule startup latency steadily increasing

I first noticed this on a simple Rule Machine app that was triggered by a button push. The latency between the button event trigger and the execution of the first action was initially small after a hub reboot, but steadily increased until the next reboot.

Further testing showed that this issue also applies to other rules. The latency increase is about 250 milliseconds per day. This may not sound like much, but it accumulates, and soon makes an interactive rule unusable. The latency between the rule’s first action and the second action does not increase. The latency of similar functionality implemented in WebCoRE pistons or custom apps also increases, but much more slowly.

I have a C-7 running version 2.3.5.121. The hub cpu load is consistently below 10%.

Has anyone else seen this? Any thoughts on what causes this or how to debug this?

2 Likes

I'd be inclined to think that the delay is between the devices used as the trigger and the hub. It's also possible you have some errant app/device causing overall latency. So, two things:

  1. Enable all logging for a rule that's misbehaving.
  • Probably want to pick a rule with a single trigger.
  • When the rule triggers and you notice the delay, go to the log page, filter on the device and the rule, and post the the results.
    *That will advise whether the delay is actually happening before or after the hub/rule sees the trigger. (e.g., if the logs show the rule immediately runs, then the problem is probably device->hub)
  1. Check the hubs processing load:
  • Go to the "Device stats" and "App stats" pages from the "Logs" section and overall usage stats. If they aren't high, then you're probably good. If the overall is high, then look at the individual apps/devices to see what you're heavy hitters are.

Is there a way to reset the Device and App stats?

Thanks for your help.

Oh...there used to be a button, but it's gone now. A reboot resets the numbers.

Here is an example log, showing the current 800 millisecond delay from the app event to the first app action. There is minimal delay from the button push to the app event, and from the first app action to the second app action to the device action.

Interesting. I have a device and log times with hub watchdog and i am not seeing a degradation of times.. ie

1 Like

I don't see anything unusual on the "Device stats" or "App stats" pages. No item is above 1% of total. All the usage on the two pages together is about 4% of total.

Can you PM me the hub id? I'll check the engineering logs, there may be some clues there.

The rate of latency increase seems to be proportional to the amount of hub activity/events.

Also, I have observed one rule that often takes a very long time to save after editing (more than 2 minutes in method updated of child app). After this happens the latency immediately drops, but continues to increase from the new value. There is no long save time if I do a second save of the rule shortly after the first save.

1 Like

Did you ever get to look at this? My hub does the same thing, after boot it acts exactly as expected with regard to latency, but after a few days it starts to get ridiculous. A reboot fixes it. Happy to send my hub id if you want to have a look.

Wow ...exactly what I have been noticing..
Whom do I send my hub id, or how do I troubleshoot?
My device and app stats never go above 2% and 5% but i have been getting device with event log full , it is an whole house energy monitor

I'm sure the HE dev's would be interested to look at your setup, but in the meantime, I'd suggest some of the earlier recommendations of turning on logging for some of the key rules where you are experiencing delays, plus I would also suggest installing the Hub Info driver and setting up a device so you can see memory and CPU usage, plus other stats.

I don't see the.latency during running but do see the long time to save rules.

I have seen both the slow saves and the latency executing rules. Not sure they both happen at the same time.

I'm not sure they are the same problem or even related, I'd keep an open mind whether there is one or two issues here.

Have a rule that is exhibiting the issue with logging turned on. Have PMed a screen shot of log to gopher.ny and sent him my hub id.

Also have Hub Info driver installed. Haven't really seen anything on that indicating a problem. CPU & Memory usage seem fine. DB Size grows and shrinks as I would expect.

1 Like

Then I can't suggest anything more.... hopefully you will get to the bottom of it.

2 Likes

Can you update to the latest firmware?
Once the hub is running the latest version, please open the affected rule(s), go to Actions, then click Done with Actions and, once app is back to the main rule page, Done.
There were some recent changes to Rule Machine that are likely to make a difference.

3 Likes

Thanks for the reply. I was on .130 release and seeing the problem but I have now updated to .131. I've resaved a couple of the rules that exhibited the behaviour and have also left one problem one untouched as a control case. Theoretically it should still have the problem and the others not. Will monitor for a few days.

1 Like

Thank you - please let me know if the issue persists.

Unfortunately it appears the problem persists. Overnight the delay between the Trigger and the Action has stretched out to around 800ms. Immediately after reboot it was around 100-120ms.

Here is a log straight after boot:

Here's a log after about 23 hours:

Let me know if you need more info. Also I PMed you my hub id earlier if you need to have a look.