Reliability Has Been Super

I think so. So Hubitat should directly encourage this style of rule creation.

Didn't think of that. But I agree with you.

Either that - or improve the robustness of simultaneous database accesses.

It might be a helpful tip for people. When dealing with a small system, it's often safer and faster to process many small things than to slurp in one giant thing and execute it.

I thought I read recently that something was being done in this area? Maybe? If not improved concurrency, then improved locking so the database can't be corrupted so easily.

Maybe in a private conversation? I haven't seen anything posted on the community forum yet.

Going back to this: it is very true of any complex system. So is this a user education issue? A missing safeguard? Or at the very least, does it need a big "Danger, you may break things" warning?

No. Maybe it was a discussion about the need for an improved DB, and I mistook it to mean that something was actually being done.

Maybe, maybe not.

For instance, I had motion lighting, which is a kissing cousin to RM, occasionally causing massive floods on my zwave network.

There was nothing special (or wrongly done) in the 2 rules I had in motion lighting. Just occasionally, on OFF commands, it would get into a spiral where it would send 3-5 off/on events PER SECOND to the light it was controlling for no obvious reason, until it exhausted the buffer and started puking errors all over the place in the logs.

So even in times when the rules are perfect, there can be issues in the system. You may be right, and I may be wrong, but my experience does not support discounting issues in the tools.

And yes, I am 100% positive it was motion lighting causing the issue - no other app or rule used the motion sensor or light involved. Not even a dashboard. And it has never happened again after removing the 2 motion lighting rules.
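For anyone who wants to catch a spiral like the 3-5 off/on events per second described above from the outside (for example, in Node-RED or a script watching the event stream), here is a minimal sketch of a sliding-window counter. The class name, the threshold of 3 events, and the 1-second window are illustrative assumptions, not anything Hubitat provides.

```python
from collections import deque


class FloodDetector:
    """Flags when more than `max_events` commands hit one device inside `window_s` seconds.

    Hypothetical helper for illustration only; thresholds are assumptions.
    """

    def __init__(self, max_events=3, window_s=1.0):
        self.max_events = max_events
        self.window_s = window_s
        self.times = deque()  # timestamps of recent commands, oldest first

    def record(self, now):
        """Record a command at time `now` (seconds); return True if the rate looks like a flood."""
        self.times.append(now)
        # Drop timestamps that have fallen out of the window.
        while self.times and now - self.times[0] > self.window_s:
            self.times.popleft()
        return len(self.times) > self.max_events


detector = FloodDetector(max_events=3, window_s=1.0)
flags = [detector.record(t) for t in [0.0, 0.2, 0.4, 0.6, 5.0]]
```

With these timestamps, only the fourth command (the fourth inside one second) is flagged. It won't explain why the flood happens, but it could at least log it or cut the rule off before the buffer fills.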

That was my problem: an arbitrary decision that fewer rules with more complexity were better than more rules with less complexity.

There are lots of posts like this scattered everywhere with best practices. I don't recall seeing this in the official documentation.

There should be best practices and cautions throughout the documentation. Things like this: "If you set power reporting to 1% change AND 1W change, you will overload the network. Use these settings wisely. It is suggested to use XYZ and tune to get no more than 1 report per second."
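To make the "no more than 1 report per second" idea concrete, here is a rough sketch of how a change threshold and a minimum interval combine to cut down reports. It is a simulation only: the function name, the 5 W threshold, and the sample data are made up for illustration, and real devices expose these as reporting parameters rather than code.

```python
def filter_reports(samples, min_delta_w=5.0, min_interval_s=1.0):
    """Return the (time_s, watts) samples that would be reported.

    A sample is reported only if it differs from the last reported value by at
    least `min_delta_w` AND at least `min_interval_s` has passed since the last
    report. Both thresholds are illustrative assumptions.
    """
    reports = []
    last_t = None
    last_w = None
    for t, w in samples:
        if last_t is None:
            # Always send the first reading so there is a baseline.
            reports.append((t, w))
            last_t, last_w = t, w
            continue
        big_enough = abs(w - last_w) >= min_delta_w
        slow_enough = (t - last_t) >= min_interval_s
        if big_enough and slow_enough:
            reports.append((t, w))
            last_t, last_w = t, w
    return reports


samples = [(0.0, 100), (0.2, 101), (0.4, 110), (1.0, 112), (1.5, 120), (2.2, 121)]
sent = filter_reports(samples)
```

Here six raw readings shrink to three reports. Set both thresholds near zero and every reading goes out, which is the overload case the warning is about.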

Same for rules, I don't think it explicitly says "use smaller rules, rules are free" anywhere, but that has been the recommendation on this board since I have been here, about a year now.

I still say they need a good technical writer to pore over the documents. There is tons of good stuff in the Rule 4.0 thread that has never made it into the documentation, for example. It is hard to sift through those hundreds of posts to find best practices.

I remember your describing that several months ago and my thinking, "How the hell could that happen?". Given your background and skillset, I believe you completely, reiterating this point:

I've seen messages that stranded and ghost z-wave devices should be "cleared" out during the nightly cleanup job. Didn't happen on my HE - so clearly conditions exist where the built-in tools don't function as anticipated.

Or create more "recommended practices" videos. Either way, that investment will definitely pay itself back by reducing the load on support.

Me too. I still don't know why it did it.

I got it down to just 1 motion lighting rule, and it would still do it a few times a week, although not 100% predictably or on demand. I deleted and recreated the very simple motion=on, delay-off rule, and it still happened one more time.
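For reference, the logic of a "motion=on, delay off" rule is about this simple. The sketch below is my own guess at the behavior, not the Motion Lighting app's actual code (which isn't visible); the class, the 60-second delay, and the skip-redundant-commands check are all assumptions.

```python
class MotionLight:
    """Minimal model of a motion-on / delayed-off rule (hypothetical, for illustration)."""

    def __init__(self, delay_s=60):
        self.delay_s = delay_s
        self.light_on = False
        self.off_at = None   # time at which a pending "off" should fire
        self.commands = []   # log of (time, command) actually sent

    def _send(self, now, command):
        wanted = (command == "on")
        if wanted == self.light_on:
            return  # skip redundant commands so nothing is re-sent needlessly
        self.light_on = wanted
        self.commands.append((now, command))

    def motion(self, now, active):
        if active:
            self.off_at = None              # cancel any pending off
            self._send(now, "on")
        else:
            self.off_at = now + self.delay_s  # start the off delay

    def tick(self, now):
        """Call periodically; fires the delayed off once it is due."""
        if self.off_at is not None and now >= self.off_at:
            self.off_at = None
            self._send(now, "off")


rule = MotionLight(delay_s=60)
rule.motion(0, True)
rule.motion(10, False)
rule.tick(30)
rule.motion(40, True)    # motion again cancels the pending off
rule.motion(50, False)
for t in (109, 110, 111):
    rule.tick(t)
```

In this model the whole sequence produces exactly one "on" and one "off", which is why repeated off/on bursts from a rule this simple are so hard to explain.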

At that point I bailed out and moved the logic to Node-RED. I can't see the Motion Lighting, or RM, code so have zero chance to troubleshoot/figure out the cause. And I didn't have infinite time to go back and forth with support trying to figure it out.

Never removed/re-paired any of the devices involved, yet it has never happened again once I retired Motion Lighting and RM....

I get that maybe you can drive yourself into the ground in RM with complicated/poorly formed rules... But in MOTION LIGHTING? If you can make a rule in there to kill the hub (accidentally or on purpose), then Horatio, something is rotten in the state of Denmark.

My GUESS is that the issue I saw was related to using GE Motion Dimmers, and something wonky with having the motion and level% coming from the same device.

I've seen this over and over, and heard it said on the video casts... not sure why it's not documented.

Do you use RM or the built-in apps extensively? Or your own apps?

But in any case. My reliability of Hubitat is basically PERFECT now. No scheduled reboots needed, no slowdowns, no issues.

(Of course, it is basically I/O only, with essentially no apps other than MakerAPI and Life360.)

🤞 Even my Xiaomi Aqara/Mijia HE has been rock stable.

I've been wondering if the hub missing (or not responding to) something like a "heartbeat" because it is bogged down increases the probability that these devices fall off.

Hmm....

It's not a "Hmm" to me any more. 🙂

I removed apps/logic, didn't change any devices, didn't change any drivers. Problem gone.

We can argue about whether my logic was bad or the underlying tools have issues, but the end result is the same. I am 100% positive it was something going on in motion lighting, RM, and/or HubConnect that was killing my hub. And I am fairly certain it was not HubConnect, based on my removal testing and the sequence in which I removed things.

Why did those apps cause issues on my hub? Well, I'll leave that to the academics. I can't see Hubitat's code, so I can't begin to understand why. Might not understand if I could see the code either - RM is quite complex and there are no debugging tools.

And can I assume your motion lighting speed is also much faster?

Really about the same. It was plenty fast in the Motion Lighting app - faster than it was in RM.