First - excuse me for the noise. @jwetzel1492 (the author of this app) never replied to my few related posts. So, I am not positive this app is still supported.
Except for this app, the mesh is rock solid. The app has a built-in loop prevention mechanism which is time sensitive. Without the mesh this app is far more stable but still fails occasionally, so the problem is not easy to reproduce. The mesh obviously adds some extra delay and the problem becomes more pronounced. Most likely the timeout(s) need some tweaking.
Did I make my theory clearer?
BTW, there's nothing useful in the logs. When everything is OK, an event is logged for each participating device, plus an entry showing that the loop prevention logic was invoked. When synchronization fails, the corresponding event is simply missing. That's it. So even if the problem could be reproduced more often, the logs "as is" would not give any clue about what is wrong.
If your devices typically respond quickly, I would lower the Estimated Wait Time setting. The default is 5 seconds, and I think taking multiple actions on the devices within those 5 seconds could cause confusion within the app.
Also if there is a master device, you could set the master and enable the polling option to keep both in sync with the "Master"
Thanks for the advice, I will try this. But my best guess is that this timeout is a protection against non-responsive devices, so the app will not wait forever for a response which will never come. If I am right, the app's behavior will not change. In my case one of the devices is virtual, which is very fast and always responsive. The second device is physical but also reasonably fast and certainly responsive.
Polling is a very undesirable option.
Frankly, I am using this app to deal with the many child devices (brought from HA to HE via HADB), which are a real headache when you need to swap them in multiple rules.
It says what it is for right in the app, and it's not that.
WARNING: Only adjust Estimated Switch Response Time if you know what you are doing! Some dimmers don't report their new status until after they have slowly dimmed. The app uses this estimated duration to make sure that the bound switches don't infinitely trigger each other. Only reduce this value if you are using very fast switches, and you regularly physically toggle 2 (or more) of them right after each other (not a common case).
I'm still supporting it. However, Hubitat apps are not my day job. I have other responsibilities, and it has been the holidays. I haven't checked in on this forum in a while.
The next step would be for you to post your logs, with some commentary on what you believe they mean.
Thanks!
Edit: This app was written before Hub Mesh existed. Having 2 syncing processes that have to work together is certainly going to be more challenging, just from a Computer Science perspective. More things that can go wrong. Could you make Switch Bindings operate on devices on just a single hub, and then use Hub Mesh on the result? That is likely to be more reliable than having Switch Bindings operate against devices that aren't on its same hub.
I am glad to hear the app is still supported. I fully understand this is not your primary job, and sure, the holidays are a very nice time to just relax.
With the new GUI (my C8 Pro hub is updated to the latest platform, 2.4.0.146) I don't even see an option for enabling/disabling logs, neither for the main app nor for the child. With the previous platform version (2.3.9.201) I did capture logs, but they were not really useful. If everything was OK, the logs showed two events (one for each participating device) followed by something like "starting a loop prevention" (or so). However, when synchronization failed, one event was simply missing. That was it.
Now, with the new GUI, I don't even know how to enable logs.
In my setup everything (all automations, Zigbee and Z-Wave mesh) is running on the main C8 Pro hub. However, I am running Home Assistant (literally only as an assistant) to bring in a bunch of devices which are not supported by HE. To do this I am running HADB (Home Assistant Device Bridge) on a C7 hub which is hub-meshed to the main C8 Pro. Because HADB brings in all devices as child devices (which are a huge headache to deal with), I am using the Switch Bindings app to map remote devices from HA to local devices on HE. Frankly, I could use all those remote devices directly, but as I already mentioned this is not desired.
Sure, the Switch Bindings app is far more stable (but still fails occasionally) if all participating devices are on the same hub. Hub Mesh definitely adds extra delay and the failure becomes more pronounced. I did not dig through the app code and I could be wrong, but my best guess is that it is some sort of timeout issue related to device responsiveness vs. the loop prevention logic.
BTW, the app is very nice and REALLY helpful, specifically (at least in my case) for dealing with all those child devices. My main use for this app is to convert child devices into normal devices.
Yes, this is exactly right.
When sync fails, the C7 child is not updated if the C8 device is toggled, or the C8 child is not updated if the HA/C7 device is toggled. The HA<-->C7 link has always been rock solid.
Yes. Decreasing the timeout did not make any difference up to a certain point (I do not remember the exact value, but somewhere around 1000-1200 ms), below which things became completely broken. I am keeping the default value of 5000 ms. Attempting to toggle either side again within this time frame (5 s) produced unreliable results (expected behavior).
So if the failure happens between the C7 and the C8 mesh device (not final virtual device) then I assume it is failing at the hub mesh link? Or is the disconnect between the Mesh device on the C8 and the virtual device on the C8?
If the failure is at the hub mesh link I am still failing to understand how this app could be having any impact. SB app is just keeping the two devices on the C8 in sync with each other, which it sounds like is working fine unless I am misunderstanding.
The app itself does not introduce any impact. It simply fails to sync devices due to the extra delay obviously introduced by Hub Mesh. On the other hand, Hub Mesh is rock solid. My setup has many devices brought to the main C8 via Hub Mesh and HADB + Hub Mesh. All this (thank God) is rock solid.
If I toggle any hub-meshed devices on either side which are also part of the SB app, no failure is observed. However, if the same device is controlled by SB, sync occasionally fails. I don't know how to explain this observed phenomenon more clearly.
Again, I have no idea how the "loop prevention logic" works, but because the "attempt loop prevention" event is the very last one in the log, I assume this is a timing-based mechanism. Obviously Hub Mesh adds extra delay, and this could be just a borderline value.
Just thinking out loud (and this could already be the case).
If I were designing something like this, I would use an event-driven loop prevention mechanism. Here is how. There is always an "initiator" device in a group (the one which is toggled first). Based on this initiating event, the app should send the corresponding command to all other group members, then wait for the events from all members to make sure they responded, simply ignoring those events until the last one is received. To prevent failure due to a potentially unresponsive/bad device, this should be protected by a reasonably large timeout, and the log should report which device did not respond.
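To make the proposal above concrete, here is a rough Python sketch of an event-driven scheme. It is not the Switch Bindings code (which is a Groovy app) and every name in it (`EventDrivenBinding`, `send_command`, `on_event`, `timeout_s`) is hypothetical. It assumes each member reports exactly one event per command, and it passes the current time in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class EventDrivenBinding:
    """Hypothetical event-driven loop prevention for a group of bound switches."""
    members: List[str]
    send_command: Callable[[str, str], None]  # (device, state) -> None, assumed hook
    timeout_s: float = 10.0                   # "reasonably large" safety timeout
    pending: Set[str] = field(default_factory=set)  # members that still owe an echo
    target: Optional[str] = None              # state requested by the initiator
    deadline: float = 0.0
    log: List[str] = field(default_factory=list)

    def on_event(self, device: str, state: str, now: float) -> None:
        self._expire(now)
        if device in self.pending and state == self.target:
            # Expected echo of our own command: swallow it, do not re-propagate.
            self.pending.discard(device)
            return
        # Anything else is treated as a new initiating event.
        self.target = state
        self.pending = {m for m in self.members if m != device}
        self.deadline = now + self.timeout_s
        for member in sorted(self.pending):
            self.send_command(member, state)

    def _expire(self, now: float) -> None:
        # Safety net: report members that never confirmed, then stop waiting.
        if self.pending and now >= self.deadline:
            for member in sorted(self.pending):
                self.log.append(f"{member} did not confirm '{self.target}'")
            self.pending.clear()
```

One design choice to note: an event from a pending member that does not match the requested state (for example a device that bounces back to ON) is treated here as a new initiator rather than being dropped. Whether that is desirable depends on the devices involved.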
Let's say you switch OFF the virtual device in this setup. The SB app will see that event and then push that change over to the bound device (the Mesh device on the C8 side in your case). This would be the same as if you were on the device page of that mesh device and clicked the OFF command. Then SB would also see the mesh device was changed to OFF, but if it is within that timeout window it will ignore it to prevent recursion loops. There is really nothing else going on with SB. In a nutshell, all it does is run the "OFF" command on the bound device.
In the meantime, Hub Mesh will see the C8 Mesh device has changed to OFF and push that change across to the C7.
So again, the only way I think it is possible this app is causing any of your issues would be if the C8 Mesh device and C8 virtual device are getting out of sync for some reason. That is the only thing SB has any part in. If that is not the case and it is the C7 <> C8 mesh devices that get out of sync then the problem lies solely in Hub Mesh.
You mean the log entry checkForFeedbackLoop: Preventing feedback loop ?
Looking at the code, that will be logged when the bound device changes states within the response timeout window set. So if A switches to OFF, SB pushes that to B, B changes to OFF. Then SB will see that but block that with the above log. Now if B then rapidly changes back to ON again that would also be blocked if the timeout was not reached yet. (This could cause A-B to be out of sync, but this is unrelated to Hub Mesh)
So there is a chance that if device B is rapidly flipping states after A was initially triggered, it can cause issues. That's where setting the timeout lower can fix the issue. However, if just A flips back and forth a few times (like someone manually flipping a switch on and off), it should be able to handle that correctly and push each change to B.
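The time-window behavior described above can be modeled in a few lines of Python. This is only a sketch of the description in this thread, not the actual app code: the class name, `send_command`, and the per-device `last_pushed_at` bookkeeping are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, List


class TimeWindowBinding:
    """Hypothetical model of time-based feedback-loop suppression."""

    def __init__(self, devices: Iterable[str],
                 send_command: Callable[[str, str], None],
                 response_time_s: float = 5.0) -> None:
        self.devices = list(devices)
        self.send_command = send_command          # assumed hook: (device, state)
        self.response_time_s = response_time_s    # the "estimated response time"
        self.last_pushed_at: Dict[str, float] = {}  # when we last commanded a device
        self.log: List[str] = []

    def on_event(self, device: str, state: str, now: float) -> None:
        pushed = self.last_pushed_at.get(device)
        if pushed is not None and now - pushed < self.response_time_s:
            # The device was commanded by us recently: assume this is an echo.
            self.log.append(f"Preventing feedback loop: {device} -> {state}")
            return
        # Otherwise propagate the change to every other bound device.
        for other in self.devices:
            if other != device:
                self.last_pushed_at[other] = now
                self.send_command(other, state)


sent = []
sb = TimeWindowBinding(["A", "B"], lambda d, s: sent.append((d, s)))
sb.on_event("A", "off", now=0.0)  # pushed to B
sb.on_event("B", "off", now=0.2)  # echo from B, blocked
sb.on_event("B", "on", now=1.0)   # real change on B inside the window, also blocked
sb.on_event("A", "on", now=2.0)   # A flips again, still pushed to B
```

In this model the third event is the problem case: B really changed to ON, but the change arrived inside the window and never reached A, so the two stay out of sync until something else changes.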
Do you possibly have changes coming from both sides fighting each other and butting at the Hub Mesh sync? For example the virtual switch changes state via some automation, and the actual device connected to HA changes to a different state close to the same time? Then as those traverse the bi-directional sync path the two conflicting states butt into each other at Hub Mesh? I am not sure how HM would handle that, maybe it just drops it because it doesn't know who wins?
All your explanations are very logical and very reasonable.
But if I control the device on HA and/or the corresponding child device on the C8 (with the C7 in the middle), this always works and is blazing fast (near instant). However, when I try to control the Virtual Switch on the C8 (which triggers an RM rule), the rule is triggered but occasionally the HA device is out of sync because the C7 middle man lost an event.
In the opposite direction, when the HA device is toggled, the corresponding C8 child is updated but the SB-bound VS on the C8 is not.
No, this is not the case.
HA does not have any logic at all. It simply fills the gap for the devices which are not supported directly by HE. The C7 in this case is nothing more than a middle man. It also brings in some Zigbee toys which could not be linked directly to the C8 (Pro).
Everything is happening only on the C8 side.
Everything that is brought to the C8 via HADB + Hub Mesh is one of my lowly child devices. If a child device is used only once, I am OK with using it directly. But if it is part of more than one RM rule, I try not to use the child device directly (because it is a huge headache to swap them if needed) and use a virtual device instead. This is easy for unidirectional devices such as sensors, but becomes a challenge if I need to synchronize dimmers/switches in different locations. The SB app is a perfect fit for these cases.
Yes, but when it works, it syncs first and then the log entry is blocking a bounce-back loop. If it is failing as I described above, then I suspect that is what is blocking the change from happening.
Go to line 328 in the child app code (Switch Binding Instance) and uncomment this log line (remove the //).
That will give more info about why it is being blocked in the logs going forward.
This in theory should only happen if the Virtual Device (V) changes state (let's say it is turned OFF), that goes to the C8 Mesh device (M), and M is set to OFF. Now right after this a change comes from Hub Mesh and M is changed back to ON. That change would be blocked from going to V due to the timeout, and those two devices would be out of sync.
I could see this happening due to the multiple layers you have created here. V - OFF goes to M - OFF, this goes to the C7 - OFF, which goes to HA - OFF, but for some reason the device does not change to OFF, so HA kicks back the ON state, the C7 changes to ON, M changes back to ON via Hub Mesh, and then SB blocks this change from reaching V, leaving V OFF.
I think you can "Swap" mesh devices easily via Hub Mesh, so this is probably not even needed (see Hub Mesh | Hubitat Documentation). It sounds like if you were to change something on the C7, you can link a new physical device to the same mesh device on the C8, effectively doing a swap.
OK,
I will create a clean Test Environment and will try to collect more data (as much as it will be possible).
Gladly, because everything is running on the C8 and nearly all my automations are "true automations" (i.e. human interaction is near zero), these occasional sync failures have near zero impact, still keeping WAF very high.
The problem is not related to the meshed devices. The problem is all the child devices created by HADB. These are not easily swappable (swapping requires manual editing of every rule which uses the child device). And I have too many of these devices coming from HA
(and this list is about only 20% of them):