Multi Hub Task Splitting

This is mostly true, though you also have the App Status page that can give you a view into the current state of your app, plus tools like App Stats. But with logging being entirely under your control, it can be a quite flexible tool during development. This is how all the built-in apps, including Rule Machine, were made, too (well, aside from a couple that are just wrappers around some built-in features like HomeKit). And it's certainly more flexible than a Rule where you can only turn the logging it does offer on or off. :slight_smile:

OK, maybe debugging is not as bad as I thought.
Now I have another dilemma. Of course, for the learning curve I have to start with something very simple. But then what? I cannot replace all my RM rules (there are far too many) with custom apps. Should I replace the rules that fail more often? Basically every rule with a "Wait for Expression" statement is prone to failure. Since the failure is random, the rules that run more often will statistically fail more often. And creating a very flexible custom app would be nothing more than another RM. At this point I am lost on a long-term strategy.

The more I think about this failure, the more convinced I am that the problem is with the "Wait for Expression" implementation. The "Wait for Expression" statement is combinational logic, but it relies heavily on events (because the hub is event-driven). I am nearly 100% sure the frequency of the failures depends on the logic complexity of the "Wait for Expression" statement. I cannot recall any rule with a very simple expression (waiting for only one event/state change) ever failing.
More logic in the expression requires a re-evaluation on every event, whenever a device changes its state. And what if two (or more) events happen at nearly the same time? Depending on the implementation details (which are a huge unknown), there could be a race condition in which at least one of the near-simultaneous events is missed, resulting in a rule failure.
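
The suspected race can be illustrated with a small Python model. This is purely a hypothetical sketch of how near-simultaneous events could be missed; it is not how Rule Machine is known to work, and the names `expression` and `run_handlers` are invented for the illustration. The assumption is that each event handler evaluates the expression against a stale snapshot of device state that contains only its own change.

```python
# Hypothetical model only: NOT the actual Rule Machine implementation.
def expression(state):
    # The wait is satisfied only when both devices are off.
    return state["A"] == "off" and state["B"] == "off"


def run_handlers(initial, events, stale_snapshots):
    """Apply events and re-evaluate the expression once per event.

    With stale_snapshots=True, each handler sees the state as it was before
    the burst of events, plus its own change only (the assumed race).
    Returns (rule_woke_up, expression_is_true_at_the_end).
    """
    committed = dict(initial)
    woke_up = False
    for device, value in events:
        committed[device] = value
        if stale_snapshots:
            view = dict(initial)
            view[device] = value
        else:
            view = dict(committed)
        if expression(view):
            woke_up = True
    return woke_up, expression(committed)


initial = {"A": "on", "B": "on"}
events = [("A", "off"), ("B", "off")]

serialized = run_handlers(initial, events, stale_snapshots=False)
racy = run_handlers(initial, events, stale_snapshots=True)
print(serialized)  # (True, True)
print(racy)        # (False, True)
```

In the racy case the expression ends up true, but no handler ever observed it as true, which matches the symptom of a rule that never wakes up. A wait on a single device cannot hit this case in the model, since there is no second event to interleave with.
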
Anyway, at this time I am not quite ready to dive into creating custom apps.
I like RM (when it works) and I have a few ideas for working around the observed failures.
I will try this first.

UPDATE

Here is a modified failing rule:

The original "Wait for Expression" with Duration statement is replaced with a Repeat Loop.
The idea is to force re-evaluation of the expression.
The rule has been tested and works as expected. But of course some time is needed to see whether this modification actually fixes the problem.
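
As a rough sketch of the Repeat Loop idea (polling instead of waiting for an event), the Python below re-evaluates a condition on a fixed period. The names `repeat_until`, `fake_sleep` and `bathroom_clear`, the 10-second period and the 25-second timing are all assumptions made up for the example; the actual rule is built in the Rule Machine UI, not in code.

```python
def repeat_until(expression, period_s, max_iterations, sleep):
    """Re-evaluate expression() every period_s seconds.

    Returns the iteration on which it was first seen true,
    or None if max_iterations is exhausted.
    """
    for iteration in range(1, max_iterations + 1):
        if expression():
            return iteration
        sleep(period_s)
    return None


# Fake clock so the example runs instantly and deterministically.
clock = {"now": 0}


def fake_sleep(seconds):
    clock["now"] += seconds


def bathroom_clear():
    # Hypothetical condition that becomes true 25 s after the rule starts.
    return clock["now"] >= 25


iterations = repeat_until(bathroom_clear, period_s=10, max_iterations=100, sleep=fake_sleep)
detected_at = clock["now"]
print(iterations, detected_at)  # 4 30
```

The condition becomes true at 25 s but is only noticed at the next poll, at 30 s. In general the reaction is late by anywhere between 0 and one repeat period, which is the trade-off for not depending on a wake-up event.
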

I'm fairly confident that multiple hubs aren't going to help your issue - Rather it's going to make debugging even more complex, given the distributed nature, and tossing in a network layer (with variable delays).

Having separate hubs for different radios and LAN rules makes some sense, depending on the size/scale of your various radio meshes, as it allows more activities to happen concurrently, given more cores, etc.

But you seem focused on event processing, and trying to isolate it to a single location, which isn't going to work with multiple hubs - My understanding is that when a device/variable changes state or value, an event is triggered, and if that variable/device is part of hub mesh, the event/change is sent over the network to hubs that have a registered interest in that source device/variable. - Once the network packet comes in with the change, the event is re-triggered on the receiving hub - And interested/subscribed applications are then triggered, etc. So the event is happening on the source device (location of the "real device"), sent over the network, and then re-triggered on the receiving hub.
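
A toy Python model of that flow, based only on the understanding described above and not on the real hub mesh protocol (the `Hub` class and its method names are invented for illustration):

```python
class Hub:
    """Toy hub: delivers events to local subscribers and forwards them to interested hubs."""

    def __init__(self, name):
        self.name = name
        self.local_subscribers = {}  # device -> callbacks (apps/rules on this hub)
        self.remote_interest = {}    # device -> other hubs that registered interest

    def subscribe(self, device, callback):
        self.local_subscribers.setdefault(device, []).append(callback)

    def share(self, device, other_hub):
        self.remote_interest.setdefault(device, []).append(other_hub)

    def fire(self, device, value, hops=0):
        # Deliver to apps on this hub first.
        for callback in self.local_subscribers.get(device, []):
            callback(self.name, device, value, hops)
        # Then "send over the network": the event is re-triggered on each interested hub.
        for other in self.remote_interest.get(device, []):
            other.fire(device, value, hops + 1)


log = []
radio_hub = Hub("radio")
rules_hub = Hub("rules")
radio_hub.share("motion1", rules_hub)
rules_hub.subscribe("motion1", lambda hub, dev, val, hops: log.append((hub, dev, val, hops)))

radio_hub.fire("motion1", "active")
print(log)  # [('rules', 'motion1', 'active', 1)]
```

Every extra hop is another place where delay or loss can occur, which is why splitting rules across hubs tends to make an intermittent timing problem harder to debug, not easier.
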

I've had similar problems to what you're describing, and I also use PBs in RM and required expressions to avoid multiple/concurrent executions. - The thing I've learned is to avoid wait for expression/delays in the internals of rules. - I just use various global Hub Variables to record state, and keep rules very short and flow-through - If I need to wait for something to "finish" an operation/task, in my world that's a separate rule, with the "wait" as part of the TRIGGER based on one of the hub variables. - I keep the related rule "snippets" together just based on naming conventions. That keeps all the rules from being "blocking" and they quickly clear PB/execute/set PB. - So a given automation task is typically made up of 3-4 small rule segments that coordinate via HVs. - This also allows "external things" (aka Alexa/Google Home) to execute subsets of these rule segments based on verbal commands (e.g. end a task that's midway in progress, etc.)

Obviously, YMMV, and you can just code apps in Groovy as well - But if I have a rule with a delay/wait inside of it (I do have a few, and I don't use PBs in those cases) - I reconsider that approach, and always try to have the "waits" as part of a smaller rule's trigger. - Moving to multiple hubs, for this specific reason around RM, is going to make matters even more difficult to debug, IMHO (not saying there aren't valid cases for multiple hubs - dealing with LAN stuff being a large one)

Good Luck.

That is my second idea for working around the observed issue.
Having multiple rules to perform a single task is a bit messy, and that is why this one is second in line. My first idea is to try a Repeat Loop (I already posted the modified rule above) to force evaluation of the expression. The modified rule works as expected, but who knows how stable it will be over time.

One other minor point, about your example with Bath Occupancy above - I'm sure you understand that would be two rules in my approach - One to mark it occupied, and a separate rule to mark it unoccupied.

My only other point is that I've found HV booleans to be MUCH faster to process (a few ms), versus messing with a Virtual Switch (your "BathRoom Occupied") - If you need the VS for other reasons, that's fine - I would set an HV boolean for occupied, right after clearing the PB, then in my second rule (waiting for motion to stop as the trigger) - I would have a required expression based on the Bathroom Occupied HV = TRUE, to start the waiting for your assortment of lights and fans to clear.

Not saying one way or another SHOULD matter, but I do spend time trying to shave ms off rule executions - Some things like Zwave/Zigbee IO delays can't be helped, but I've had better luck with HV globals for states, versus virtual switches (and in some cases I've used both, if I want the VSwitch in a dashboard, or visible to Alexa, but I always use the HV for the required expressions).

Just what I found that seems to work much more reliably for me. - short and sweet rules. - And delays and waits, as parts of triggers.

Yes, this is clear. But as I mentioned, having multiple rules for a single task is messy and much harder to maintain over time, because many little details evaporate very quickly and you have to open and inspect all the related rules at the same time. For this specific reason I am trying to keep all the logic in a single rule. Otherwise, multiple trigger-based rules seem to be a far more solid solution. Let me see how my "fix" (using a Repeat Loop) works over time.
So far so good, but it is certainly too soon to claim victory.
But I am sure that TRIGGER-based waiting for anything is far more stable.
And yes, my preference is a Boolean Variable over a Virtual Switch.

I didn't say to write one large app that handles all possible use cases. :slight_smile: As with rules, I think multiple smaller apps, possibly one for each automation, would be easier to manage.

If your concern is code re-use/duplication (perhaps many automations are similar but not identical in logic), be sure to check out the "libraries" feature.

Certainly, and just in case I will try to create a custom app, as simple as possible to begin with.
However, after collecting all the available info and doing some "brainstorming", I think I have a pretty clear picture of why some RM rules are failing, and for now I will stick with my very lovely (when it works) RM. Here is a summary of my observations:

  • A rule fails if it has a "Wait for Expression" statement in the body.
    According to the logs, every failure happens around the "Wait for Expression" statement, which is always the very last entry in the log. The failure condition is: the expression becomes true, but the rule fails to wake up and continue execution to the end.
  • The probability of failure increases with the number of devices involved and the logic complexity. I do not remember a very simple "Wait for Expression" (waiting for a single device or variable state change) ever failing.
  • Protecting the "Wait for Expression" statement with a Timeout or Duration does not help.
  • If a rule does not use a PB to protect against re-triggering, it becomes self-healing, because the next trigger event sort of "fixes" the otherwise stalled rule.

Taking all the above observations into account, I came up with at least two possible workarounds for this phenomenon. One is replacing the "Wait for Expression" statement with a periodic Repeat Loop. I tested this workaround and so far so good (more time is still needed to claim victory). The downside of this approach is that the rule reacts with a delay somewhere between 0 and one repeat period, but it does not require a complementary rule.
Another workaround (also suggested by @GuyMan) is to use a complementary trigger-based rule to control a Boolean variable, which is then used in the "Wait" statement in the main rule. Trigger-based rules are very robust, but the downside is the need for multiple rules to perform a single task. I did not test this second approach because I prefer to keep all the related logic in a single rule.

Finally, I hope the HE engineers will pay attention to this existing and very annoying problem, which has been reported by a few (advanced) users, and will fix it properly. There may not be enough solid logs, and the problem is not easily reproducible, but there is more than enough info from multiple users reporting this or a very similar problem for a "brainstorming" debugging approach by the HE engineers, who definitely know a lot more about the implementation details.

This isn't completely true, and there is a way of using a trigger based approach within a single rule. For your example, you just need two conditional triggers:

1.) Any motion active (with virtual occupancy switch off)
2.) All motion inactive and stays for 1 minute (with virtual occupancy switch on)

From there, you just need to use conditional logic in the actions:

If any motion active, then turn on virtual switch plus your other actions
Else, turn off the virtual switch and your other actions

In this scenario, you can define the triggers, but use the conditionals to help limit the rule to only running one time in specific scenarios. Everything is defined, there is no need for a Private Boolean (though you could use that instead of the virtual switch), and it avoids whatever wait issues you are having.
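
The two conditional triggers and the if/else actions can be sketched in Python as follows. This only models the decision logic described above; the function name `handle` and its parameters are hypothetical, and the real rule is configured in Rule Machine, not written as code.

```python
def handle(motion, occupied, inactive_for_s):
    """Model one evaluation of the rule.

    motion: dict of sensor name -> "active" / "inactive"
    occupied: current state of the virtual occupancy switch
    inactive_for_s: seconds that all sensors have been inactive
    Returns (rule_ran, new_occupied).
    """
    any_active = any(state == "active" for state in motion.values())
    # Trigger 1: any motion active, only while the switch is off.
    trigger_1 = any_active and not occupied
    # Trigger 2: all motion inactive and stays that way for 1 minute, only while the switch is on.
    trigger_2 = (not any_active) and inactive_for_s >= 60 and occupied
    if not (trigger_1 or trigger_2):
        return False, occupied
    if any_active:
        return True, True   # turn on the virtual switch, plus the other actions
    return True, False      # turn off the virtual switch, plus the other actions


busy = {"door": "active", "sink": "inactive"}
quiet = {"door": "inactive", "sink": "inactive"}

print(handle(busy, occupied=False, inactive_for_s=0))    # (True, True)
print(handle(busy, occupied=True, inactive_for_s=0))     # (False, True)
print(handle(quiet, occupied=True, inactive_for_s=60))   # (True, False)
```

Because each trigger is gated on the switch state, repeated motion events while the room is already occupied do not re-run the actions, which is the job the Private Boolean was doing in the original rule.
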

Just to let you know, I was a user who actually requested "Conditional Triggers".
Thanks, this was implemented and they are really very helpful.
As you can imagine, I know very well what they are and how to use them.
But unfortunately, the way "Conditional Triggers" were implemented, they have a lot of non-intuitive restrictions, which leads to using multiple rules instead of a single one, or to creating a really big set of conditional triggers (which approach to use is a matter of individual preference).

I find it absolutely fascinating that your response has nothing to do with the suggestion. Instead, it is a combination of self-congratulation and criticism of an implementation. It really is quite impressive.

Now that I have said that (and I do apologize, forum mods, if I'm skirting a little too close to the line), you have an answer that will work. This thread has gone down the exact same path as the one from the last time this rule was posted. Bruce had to shut it down because it ended up going nowhere. If you are having problems with the wait for expression, then either write your own app or rethink the rule design. If you do not want to do either of these and insist on using the rule that you created, then use Zone Motion Controller to create a motion aggregation virtual sensor. This way, your rule only has to wait for the virtual device to have an event rather than for an expression.

Unfortunately this is absolutely true.
The problem does exist and has been reported by a few more users. The nature of this problem is very random, and it cannot be easily reproduced. If it were easily reproducible, fixing it would be just a mechanical job. Unfortunately the HE staff simply ignores the existing issue. Ignoring even a very minor issue simply creates a time bomb for the future. I am trying to find a workaround for the problem and simply sharing my ideas. Replacing the potentially failing "Wait for Expression" statement with a Repeat Loop looks good so far, and I believe this is a very reasonable workaround (of course more testing is required). The reason I think it is a "fix" is that the expression actually becomes TRUE but the rule fails to wake up. Using a Repeat Loop should wake up the rule on some iteration, and so far it works very well.

I am really sorry you feel this way.
Some criticism is not a bad idea and may lead to some progress.
But I am surprised you feel like I am being self-congratulatory.
Maybe this is because English is not my native language, sorry about this.
My attempt (and not only mine) to get the staff to pay attention to the problem failed.

PS.
I guess this discussion is over and the thread could be closed.

I agree with this recommendation of using the built-in Zone Motion Controllers app. You have several options to determine when a set of motion sensors is inactive, and then you use that motion zone device in your rule. You won't need all the wait for expression code. I have had several motion zones set up in my house for years and they work great.
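
The aggregation idea can be modelled in a few lines of Python. This is a simplified sketch with one assumed policy (the zone is active while any sensor is active, and goes inactive only when all are inactive); the real Zone Motion Controllers app offers several options, and the `MotionZone` class here is invented for illustration.

```python
class MotionZone:
    """Toy aggregate sensor: collapses many motion sensors into one virtual device."""

    def __init__(self, sensors, on_change):
        self.states = {name: "inactive" for name in sensors}
        self.zone = "inactive"
        self.on_change = on_change  # called only when the zone state actually changes

    def sensor_event(self, name, value):
        self.states[name] = value
        new_zone = "active" if "active" in self.states.values() else "inactive"
        if new_zone != self.zone:
            self.zone = new_zone
            self.on_change(new_zone)


zone_events = []
zone = MotionZone(["door", "shower", "sink"], zone_events.append)
zone.sensor_event("door", "active")
zone.sensor_event("sink", "active")
zone.sensor_event("door", "inactive")
zone.sensor_event("sink", "inactive")
print(zone_events)  # ['active', 'inactive']
```

Four sensor events become two zone events, so a rule that uses the zone device only has to react to a single device changing state instead of evaluating a multi-device expression.
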

This may address the issue for motion sensor (MS) based rules/automations.
However, the problem with the "Wait for Expression" statement is not limited to MS devices. Unfortunately any rule with this statement may fail randomly (and it does happen). I am looking for a generic fix or a stable workaround for this issue.

I had one of the 'wait for expression' rules on my non-radio C-7 hub fail last night after running successfully several times earlier in the day.

This rule had required expressions but no PB. Every time it runs, its own actions cause the required expressions to go false, so that is functionally the same as how a PB generally is used. If the required expression going false was the issue, then I would expect it to fail every time unless there's some sort of race condition going on. Instead the failures have been intermittent and it runs correctly more often than not.

To describe the rule in general terms:

Its required expression is: device "A" custom attribute <> 100
Its first action is to set that attribute to 100.
The next action begins a 'wait for expression' that has 3 conditions:

(1) Device "B" custom attribute = false,
(2) Device "C" custom attribute <> a certain string, and
(3) Power meter "D" is below a certain value and stays that way for 1 minute.

After the 'wait for expression' is true, the rule sets the device "A" custom attribute to a value (less than 100) that comes from a hub variable.

In the failure yesterday, with all logging turned on, the rule never logged anything after the 'wait for expression' and the rule never set the custom attribute on device "A" back to the correct value. When it runs successfully, it will start logging alternating "Duration event..." and "Expression now false..." log entries until it logs "Expression now true..." and "Wait over: duration"

Using the advice given earlier in this thread, I've added a second rule and a hub variable to let me simplify the 'wait for expression' and hopefully work around this issue. My new helper rule has several required expressions and one trigger which taken together, replicate the 'wait for expression' of the main rule.

The new helper rule required expression is:
(1) device "B" custom attribute = false AND
(2) device "C" custom attribute <> a certain string

The trigger is: Power meter "D" is below a certain value and stays for 1 minute.

The only action of this rule is to set my new hub variable to 'false'.

To keep the rule from subscribing to the power meter all the time, I added an additional required expression element that the new hub variable must be 'true', so this rule will not run until the main rule sets that variable to true.

I modified the original rule to make its first action 'set the new hub variable to true'. I modified the 'wait for expression' so it is waiting for only one thing... that hub variable to be false.

To explain the new rules as concisely as possible: The main rule sets the hub variable to true. This activates the helper rule which triggers once the conditions have been met. The helper rule sets the hub variable back to false. This satisfies the main rule's 'wait for expression' action. The main rule should then finish.
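
The handshake between the two rules can be sketched as a small Python model. Everything here is illustrative: `HubModel`, the variable name `waiting`, and the string `"busy"` (standing in for the "certain string") are assumptions, and the model says nothing about how Rule Machine schedules rules internally.

```python
class HubModel:
    """Toy model of a main rule and a helper rule coordinating through one hub variable."""

    def __init__(self):
        self.variables = {"waiting": False}
        self.main_finished = False
        self.log = []

    def main_rule_start(self):
        # First action of the main rule: arm the helper rule.
        self.variables["waiting"] = True
        self.log.append("main: waiting")

    def helper_rule(self, b_attr, c_attr, power_low_for_1_min):
        # Required expression: helper is armed AND B == false AND C != "busy" (hypothetical string).
        required = self.variables["waiting"] and b_attr is False and c_attr != "busy"
        # Trigger: power meter D below threshold and stays that way for 1 minute.
        if required and power_low_for_1_min:
            self.set_variable("waiting", False)

    def set_variable(self, name, value):
        self.variables[name] = value
        self.log.append(f"helper: {name}={value}")
        # The main rule's wait now has a single element: the variable being false.
        if name == "waiting" and value is False:
            self.main_rule_resume()

    def main_rule_resume(self):
        self.main_finished = True
        self.log.append("main: finished")


hub = HubModel()
hub.helper_rule(False, "idle", True)   # ignored: main rule has not armed the helper yet
hub.main_rule_start()
hub.helper_rule(True, "idle", True)    # ignored: device B attribute is not false
hub.helper_rule(False, "idle", True)   # all conditions met, helper releases the main rule
print(hub.log)  # ['main: waiting', 'helper: waiting=False', 'main: finished']
```

The multi-condition logic now lives in the helper's required expression and trigger, and the main rule waits on exactly one variable change.
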

I'm hoping that with the main rule now only waiting for a single expression element, and with that expression element completely separated from its required expression, I'll see this run reliably.

If, however, the actual issue is with the required expression going false and that cancelling any 'wait for expression' actions, then I would expect this to fail just as often as before. It will probably take anywhere from a few days to a few weeks of running without failure for me to be comfortable that the changes made a difference.

This is exactly what is happening in all the observed failures.
The very last entry in the log is "Wait for Expression", and when the rule fails it never wakes up, even after the expression becomes TRUE. This pattern is the same in every failure, but the failure itself is very random.

I’m going to apologize for my previous statement. I read your response differently than was intended.

As for the issue, I’ve got two additional thoughts. First, how does your rule work if you changed the expression to just a wait for event? I find a wait for event across multiple motion sensors works better than a wait for expression.

The other idea is to break out the motion sensors and virtual switch into Room Lighting. Then just have a rule that sets the variables based on the switch. I know you like one larger rule, but this direction would allow you to troubleshoot two different actions: on/off for the switch and the variable settings.

No problem, and thank you for the suggestions.
As we speak I am testing a Repeat Loop as a replacement for the "Wait for Expression" (the example is in post 17). So far so good, but more time/testing is needed to claim victory.

I don't have much to add to this, other than my approach. To ensure "important" actions really happen I made logic to do it for me. It is node-red based though, not RM based, as I do all my logic in node-red - so that likely doesn't help you.

I also wish there were an optional feature to have RM or other in-box Hubitat logic verify the action, but there isn't, and the staff haven't seemed interested in doing that in past conversations (due to complexity and time required - which I totally understand).
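
For readers unfamiliar with the "verify the action" idea, here is a minimal Python sketch of the general pattern: send a command, read the state back, and resend if it did not take. It is not the Node-RED flow mentioned above, and `command_with_verify` and `FlakySwitch` are hypothetical names made up for the example.

```python
def command_with_verify(send, read_state, expected, max_attempts=3):
    """Send a command and confirm the device reports the expected state.

    Returns the number of attempts used, or None if it never took effect.
    A real version would pause between sending and reading back.
    """
    for attempt in range(1, max_attempts + 1):
        send()
        if read_state() == expected:
            return attempt
    return None


class FlakySwitch:
    """Hypothetical device that silently drops the first few commands."""

    def __init__(self, drop_first):
        self.state = "off"
        self.drop_remaining = drop_first

    def on(self):
        if self.drop_remaining > 0:
            self.drop_remaining -= 1
            return
        self.state = "on"


switch = FlakySwitch(drop_first=1)
attempts = command_with_verify(switch.on, lambda: switch.state, "on")
print(attempts, switch.state)  # 2 on
```

Returning `None` after the last attempt gives the caller a place to log or notify, rather than assuming the command succeeded.
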
