I got one of these pop() errors, but haven't been able to reproduce it. Based on previous posts, it sounds related to concurrent instantiations of the same rule.
My rule usually works fine, but this occurrence prevented my lights from turning off and has some suspicious timings in the logs. I'm posting here to better understand edge cases and gaps in my mental model, particularly around assumptions I've been making about concurrency and safety.
The mode was not Theater, so only the ELSE block is relevant. Here are the logs for this rule:
Not shown in the logs: the "Hall / Motion sensor" went inactive at 7:03:47.614 pm. That would have satisfied the wait condition in the ELSE block and started the 2-min delay almost exactly 2 mins before the error.
Also suspicious: the previous trigger of the rule, at 7:01:34.015 pm, came almost exactly 2 mins before the last motion-active trigger, so its 2-min delay would have been completing at almost the same moment the motion-active trigger fired.
Timeline
- t0: Door triggered rule; rule flew through wait condition (without waiting) to the 2-min delay.
- t0+2min: That delay completes at nearly the same time as the rule is triggered again by motion.
- ~t0+2:14 (7:03:47.614 pm): The second execution's "wait" is satisfied when the motion sensor goes inactive.
- ~t0+4:14: The second execution's delay ends, followed immediately by the error.
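As a sanity check on the timeline, here is a small Python sketch that computes the offsets from the two timestamps quoted above. The offsets in the list are approximate; this just does the arithmetic. The variable names are my own, and the end of the second delay is derived (motion inactive + 2 min), not read from the logs.

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S.%f"  # 24-hour clock to avoid am/pm parsing quirks

# Timestamps quoted in this post (both were pm).
door_trigger = datetime.strptime("19:01:34.015", FMT)     # t0
motion_inactive = datetime.strptime("19:03:47.614", FMT)  # wait satisfied

DELAY = timedelta(minutes=2)  # the rule's 2-min delay


def offset(t):
    """Seconds elapsed since t0 (the door trigger)."""
    return (t - door_trigger).total_seconds()


# Derived values, assuming each delay runs for exactly 2 minutes.
first_delay_ends = door_trigger + DELAY
second_delay_ends = motion_inactive + DELAY

for label, t in [
    ("first delay ends", first_delay_ends),
    ("motion inactive", motion_inactive),
    ("second delay ends", second_delay_ends),
]:
    print(f"{label}: t0+{offset(t):.3f}s")
```

So the motion-active trigger would have had to land within a fraction of a second of the first delay's resume for the two to collide, which matches what the logs suggest.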
This sequence undermines an assumption I've been relying on, which is my main reason for posting. The rule's first action is a "cancel," all of its delays are cancelable, and waits are aborted whenever a new instance starts, so I expected concurrent executions of the rule to be impossible.
Since RM is event-based and doesn't provide thread management (and doesn't warrant the complexity of such concurrency, as far as I can tell), I assumed that both rule instances were executing in the same thread, or otherwise couldn't interfere with each other within a "scheduled unit" (the consecutive actions kicked off when a delay or wait completes). Under that assumption, only two orderings are possible: either the second trigger ran first (which would have canceled the first execution's delay), or the first execution's resume ran first (which would have completed the rule's execution before the second instance started). Is that the intended behavior, or am I missing something?
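To make that mental model concrete, here is a rough Python sketch of what I've been assuming. This is not how RM is actually implemented; `RuleModel`, `trigger`, `resume`, and `run` are hypothetical names I made up, and the wait step is left out for brevity. The only point is that if events are handled strictly one at a time, both orderings end cleanly.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RuleModel:
    """Toy model (hypothetical): events are handled one at a time."""
    pending_delay: Optional[int] = None  # id of the currently scheduled delay
    next_id: int = 0
    log: List[str] = field(default_factory=list)

    def trigger(self, source: str) -> int:
        # The first action is "cancel": drop any delay from an earlier run.
        if self.pending_delay is not None:
            self.log.append(f"cancel delay {self.pending_delay}")
        self.pending_delay = self.next_id
        self.next_id += 1
        self.log.append(f"{source}: schedule delay {self.pending_delay}")
        return self.pending_delay

    def resume(self, delay_id: int) -> None:
        # A canceled delay must not run the remaining actions.
        if delay_id != self.pending_delay:
            self.log.append(f"delay {delay_id} was canceled; nothing runs")
            return
        self.pending_delay = None
        self.log.append(f"delay {delay_id} done: lights off")


def run(order: List[str]) -> List[str]:
    """Replay the door trigger, then the two racing events in the given order."""
    rule = RuleModel()
    first = rule.trigger("door")
    second = None
    for event in order:
        if event == "resume_first":
            rule.resume(first)
        else:
            second = rule.trigger("motion")
    rule.resume(second)  # the second execution's delay completes
    return rule.log


resume_then_trigger = run(["resume_first", "motion"])
trigger_then_resume = run(["motion", "resume_first"])
print("\n".join(resume_then_trigger))
print("---")
print("\n".join(trigger_then_resume))
```

In this model both orderings finish with the lights turning off, so the error I saw suggests the two executions really did overlap in some way the model doesn't allow.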
I would very much appreciate any help to clarify my understanding! Thanks.