Look away - I'm just frustrated

Hey @jshimota, I feel for you. I’ve had so many issues over the last couple of years so I can really relate. I tried everything and still had persistent issues. Eventually things settled down a lot when I moved all my lamps onto Hue, added loads of repeaters, and moved all time critical rules to Node-Red. It was a stack of work. I clearly had mesh issues. Still now and then I find myself walking into a room and the lights don’t come on, or rules go haywire, the database size goes crazy, the free memory drops below where I know it’s going to cause an issue and my auto reboot rule fires and on and on. I’ve tried HA and it was also a stack of hassle. Whatever else people say, it’s clear to me this technology is hobbyist right now. It will progressively improve I’m sure of that. This platform is hugely improved over the last 2 or 3 years. And Matter should improve the device side and connectivity too. Hang in there buddy :smile:

So I've been using an app called Rebooter and I was actually restarting the Hubitat process instead of rebooting the hub itself. Is there any benefit to restarting the process or is that not necessary as well?

It may have been fixed now, but just restarting the process was causing issues at one point....

To add to what @jtp10181 said, scheduled reboots have been and still are a band-aid to a larger problem that should be addressed in the first place.

wow. I'm not sure what to say Gentlemen! (and ladies if present ...)

My post was more a vent than a call for help - I freely admit that. I know that my voice has power when I post in the Social Media world - and I don't want negative impact on HE without cause hence my vent here. FWTW.

My current issue - current is the operative - is rebooting. I watch everything all the power users say and when they talked about rebooting schedules a few months back due to DB size issues I took note. I was NOT having the problem, but I was trying to get ahead of the curve and did implement a 3x a week scheduled reboot during the day. Someone asked why during the day - and for me, I think it's obvious. So I can resolve the impact issues.
I will freely admit - I have not had DB size issues, 1 corrupted DB issue which was self-inflicted as I was writing a driver. I never have heat or mem issues either.
Dumping all the Z-Wave devices (I'm a C5, single hub) was the second most positive impact I've made, next to the UPS.
I'm utterly grateful and I voice that to all that help me - and the community has been incredible. I've not needed HE support staff but that's because of the community they have begotten.
At this time, I'm removing the reboot function. I'll monitor as always and see how long things stay functional. I appreciate the offer of some fine person to put a second pair of eyes - if it came to that, I'm open but I'm not really there yet. I dumped a lot - and a lot of it is historical so these aren't current issues.

I want to be a bit clearer - I lashed out at HE and I still, in a calm way, feel there is a foundational issue in how devices are initialized and refreshed after a reboot. It seems to me that all the devices should be communicated to before the green light to establish state, function and availability regardless of impact. No working device should get lost because the hub reboots. The function of Device Watchdog should be a hub staple and not an addon.

Once again - Look away! don't let my discomfort take away from the fact I've enjoyed the fruits of the labor so many of you have served. Thanks for the empathy and compassion.

I don't think you've thought that one all the way through and now that you have calmed :slight_smile: perhaps you'll have some time to carry to the end of the thought. It turns out this has been discussed too.. and every one of our hubs is unique in its use cases. Power fail recovery needs to be a per hub solution, I believe. Yea, there's probably 2% common stuff, but if I have to do the other 98% I can fit that 2% pretty easily.

This was a great unemotional/non-confrontational post that should be the makings of a sticky somewhere.

As to @jshimota's original post.... In my mind it is another compilation of thought and user perspective that underlines the useful health check/tool-scape folks could benefit from having, to discover what the hell is wrong/out of whack within their environment. Similar to what has been asked in threads dealing with the nemesis of device disappearance (aka: battery death) or a device topology overview that would map or bring attention to device radio signal reception/quality concerns.

I have opted not to layer too much of the kind of complexity that could be layered within the HE space (to frequent consternation I might add, e.g. "hey dude, you can do that easy by just adding ANOTHER Rule in RM (version N+1) ...to watch/'tidy up' after the built-in App (HSM etc), or instead just use the XYZ App written by Developer ABC who no longer supports it," etc. etc., etc.)

I gotta reel in here, this is jshimota's rant thread not mine :crazy_face: And I'm not talking about the HE hardware here.

My point being....Keeping It Simple is key for many that don't want this to be a full time hobby.

This has been discussed in other threads and the response from staff is that it is very difficult to do correctly, more so than most people recognize I suppose.

To implement an unreliable watchdog solution could be worse than not having one at all.

I will echo what @csteele says here. I once believed the way you do, but, after I thought it through following a dismissive post from Hubitat staff, I realized that they were correct and I was wrong.

First, no battery-powered devices can even be communicated to after a reboot (or at any other time). Either you physically poke the device (put a paper clip through some hole on the device to hit a switch, or hit a tamper switch) to wake it up, or else you have to wait on the next battery report at the interval that has been set on the device. That could be just once a day - do you want to wait a day after reboot for the green light to come on when that device finally reports?

Second, and more importantly, the only real way to handle reboots, in my opinion, is to have a (perhaps elaborate) rule that is run on systemStart. I have had one of the Hubitat staff be dismissive of this, saying that they just go around and poke the necessary devices and modes, etc., to get everything back after a reboot. That's not acceptable at our house because my wife refuses to open the Hubitat app (or a dashboard or the Hubitat admin page) and get things into the "right state". So, I spent the time to write a fairly elaborate systemStart rule to get things in the right state after reboot. The only time I reboot is after a firmware update, or after a power failure, in which case the hub, on a UPS, shuts itself down safely (trigger events from Ring Alarm Extender v2 devices, with a voting rule to decide that enough of them have failed) and will restart itself after power comes back by an automated power cycle plug for the Hubitat on the UPS.
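
To illustrate the voting idea only: below is a rough Python sketch, not the actual Hubitat rule (which would live in Rule Machine or a Groovy app). The names `PowerSensor`, `on_mains`, and `quorum` are made up for the example, and the quorum of 2 is an assumption, not the value used above.

```python
from dataclasses import dataclass


@dataclass
class PowerSensor:
    """Hypothetical stand-in for a mains-powered device that reports power loss."""
    name: str
    on_mains: bool  # False means the device says it is running on battery


def power_has_failed(sensors, quorum=2):
    """Return True when at least `quorum` sensors report loss of mains power.

    Requiring several votes avoids shutting the hub down because one
    device was unplugged or sent a spurious report.
    """
    votes = sum(1 for s in sensors if not s.on_mains)
    return votes >= quorum


sensors = [
    PowerSensor("extender-hall", on_mains=False),
    PowerSensor("extender-garage", on_mains=False),
    PowerSensor("extender-attic", on_mains=True),
]

if power_has_failed(sensors):
    print("Enough devices lost mains power: start a safe hub shutdown")
```

With a stricter `quorum=3`, the same readings would not trigger a shutdown, which is the trade-off to tune for your own set of devices.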

:thinking: :thinking: :thinking:

So I guess I did it wrong, since it wasn't that hard. :wink:

This is only categorically true for Z-Wave battery devices.
Most Zigbee ZHA 1.2 and 3 battery devices check their parent router for any pending commands at least every 7 seconds.

I don't understand the need to run around and poke devices and/or dink with anything on the hub to get things right during a system update...
A long term power outage, sure, modes could be off...
I'm on my third automated home and I've just not needed to do this.

Struggling here to not defend my position.
We (I include myself) will go out of our way for a single case problem - 100's of apps in the field and if one of my webapps has a bug for a single person, I fix it.
But 2% is okay? And Brian's app is just the current benchmark. The fact it exists is a testimony to my point - that so many folks experience issues that tools are necessary to track them down. I'm not saying Brian's app gets rolled up into HE, we all know about that. I'm saying a tool that is embedded in the HE and reports back upon boot 'here's what I see/saw/can't find'.
One of the areas I didn't even mention is devices that are having drivers written that don't meet a certain spec or requirement. Case in point, the driver I created for eWeLink because it (and I still don't understand this) didn't refresh right or properly or whatever he said. So it got tricked out to use presence and WD to sense availability.
And then why is there 'Metering'? And why no guidelines and defaults? Does adding devices work or not? (rhetorical). When Dr. Salwen and I were working on Ethernet protocols - CSMA/CD (carrier sense multiple access with collision detection) came into being so commands didn't get lost.
We're all just islands unto ourselves? That makes no sense. A lot of the responses here are good thinking but telling me I haven't thought it through... um. actually. I did. I'm the one who suggests there's an underlying rift between moving the ball forward and making sure the ball doesn't deflate while I'm kicking it. (okay - poor analogy).
I'm really trying hard to understand why thinking - a good ol' american trait - is being quashed for expediency. If there were no problems - then why is the FB group inundated with simple 'my light bulb won't go on' posts? argh. this is insane - I feel like I'm arguing with myself.

Ah, I stand corrected. No Zigbee here.

Um, you need to have a conversation with my wife.

Well, there is no way to differentiate long-term power fail from a firmware update reboot. Perhaps a Location Event needs to be added for firmware update reboot?

And I did have one firmware update that crossed the sunset boundary. My systemStart routine caused everything to come back as it should. Some of us really don’t like fooling around with the hub, resetting things. It’s a home automation appliance. Now that everything is set up, I only touch it when a meaningful firmware update arrives. Even then, it’s just backup, update, backup.

okay - I swear. last post - back to the metering thingy.
Since routes are readable - what are we metering? Should that be a Hub requirement to handle? Why tell the driver the metering? It's a calculable value - and even if there are secondary effects, they too can be taken into consideration. A device declared to HE should be dynamically polled so as to gain a modicum of throughput to the device, then commands issued based on distance... just saying.
Okay. I'm going bowling to get rid of some angst.

sure there is, you have control over the latter, but not the former...

I’m not in a position to judge the assessment that the hub shouldn’t be in the business of (accurately) evaluating device health. I don’t work in the IT field.

But based on their track record overall, I do believe that if this had an easy and/or highly functional solution, they probably would have done it already.

Nevertheless, I still use @bptworld’s device watchdog anyway :wink:.

But how many questions do you get from users asking "how do I know how long to put for the 'time since last activity' thing"? :smiley:

I think Hubitat could do a little bit more here since for many Z-Wave devices (battery ones, also the most likely to be of concern here) there is a defined "wake up interval." For Zigbee devices, most also respond to configuration that might include time-based reporting. The driver should know both of these, and so if that were able to be better reported for use by apps (or the platform itself), there would at least be a chance of doing something useful with it without asking so much of the user. But not all devices fit into these categories; for some, you just really can't know for sure on the hub side.
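
As a rough sketch of what "doing something useful with it" might look like (Python for illustration only; this is not a Hubitat API, and `stale_devices`, `expected_interval`, and the `grace` multiplier are names and values I'm assuming for the example): if each driver exposed its expected reporting interval, a watchdog could flag a device only after it has been silent for some multiple of that interval, and skip devices with no known interval instead of guessing.

```python
from datetime import datetime, timedelta


def stale_devices(last_seen, expected_interval, now, grace=2.0):
    """Return sorted names of devices silent for longer than grace * interval.

    last_seen:         device name -> datetime of last activity
    expected_interval: device name -> timedelta between expected reports
                       (e.g. a Z-Wave wake-up interval); devices missing
                       here are skipped because the hub cannot judge them
    """
    stale = []
    for name, seen in last_seen.items():
        interval = expected_interval.get(name)
        if interval is None:
            continue  # unknown reporting behaviour: do not flag
        if now - seen > interval * grace:
            stale.append(name)
    return sorted(stale)


now = datetime(2023, 1, 10, 12, 0)
last_seen = {
    "door-sensor": now - timedelta(hours=3),
    "leak-sensor": now - timedelta(days=3),
    "mystery-button": now - timedelta(days=30),
}
expected_interval = {
    "door-sensor": timedelta(hours=12),
    "leak-sensor": timedelta(days=1),
}

print(stale_devices(last_seen, expected_interval, now))
```

Here only the leak sensor is flagged: the door sensor is within its window, and the button has no declared interval, so nothing can be said about it from the hub side.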

I've also used a few similar platforms and haven't seen anything fantastic in this regard there, either. And as someone who remembers the frequent complaints of Device Health on ST (one of those platforms, but by no means the most recent one I'm talking about), I don't blame anyone for not wanting to implement something like this natively--even if some devices might have something that could help us out a little bit.

I do too. It was essentially useless, IIRC.

+1 this

SmartThings is still working on it years later.

I don’t either. Also, am I not correct in believing that the radios only power off with a shutdown, not with a reboot?

I must admit that the best thing I ever did, besides moving the HE to the center of my house (not in a closet), was adding a second HE hub for only Zigbee lights and a few GE Zigbee dimmers (as far as Zigbee goes that is). All of my battery powered Zigbee devices (70 or so) are on a second HE with 5 Samsung plugs, 3 Sylvania plugs, and a few GE Zigbee dimmers (in-wall). This setup has been rock solid for almost two years now. There have been some software hiccups occasionally, but nothing that wasn’t fixed by a reboot, or backup, soft reset, and restore. I also found that I didn’t need to reboot (coincided with removing Peanut plugs), and the speed actually improved after a couple days of not rebooting as the mesh became stronger.
I was very frustrated very often prior to that and understand people that have the same frustration of things not working as they should. I had the same problems on SmartThings V2 prior to HE with devices disconnected from the mesh, unresponsive, etc.