Strange shut downs and hub behavior

If you use Rootin' Tootin' Self-Rebootin, make sure you disable it when you do hub updates. If it triggers a reboot while a firmware update is going, baaaaaad things could happen.

Any older Z-Wave devices that are chatty or you have a high polling rate set for them?

Would it be better to use the other option, "Rebooter", or is this not specific to the app? I imagine that my reboots would be set to occur at an off hour in the middle of the night, whereas any hub updates I do manually and would not happen at the same time.

You could still have the problem with the scheduled reboot but you are MUCH less likely to have it occur. The problem with RTSR is that the hub upgrade itself could actually trigger a reboot since upgrades unzip and write to storage and generally do heavy, resource intensive things.

You should pause anything that could reboot the hub externally during a firmware update.

All my devices are Z-Wave Plus, BUT I do have an app that requires a high rate of polling for my alarm system which is what I have been keeping my eye on. I have suspected this is the problem but it's been running for some time without issue so I don't know why it would have started to be an issue just recently.

I should add that this is a fairly mission critical integration for me and would need to solve it even if it means moving to another platform. I'm looking into my options.

My issue was older, very chatty Z-Wave devices that I made worse by setting them to poll every second (I was warned and ignored it, so that's totally on me). They were indeed where my problem was. Same as you, I long suspected them, but I liked the use cases with this high-frequency polling rate so much that I was just living with the slowdowns they were causing by doing nightly reboots.

There have been changes to the platform that have improved things, but (and this is an absolute guess here) anecdotally, it seems like earlier versions of the platform were more tolerant of the problem devices. When I was running 2.1.3.x (for around a year), the issue really didn't present itself fast enough for me to draw a conclusion as to what the issue was. But when I finally decided I was going to move forward and get up to the latest hub platform version, I was getting the same slowdowns after 24 hours or so that many others have been experiencing. I will say that a total lockup rarely occurred, but things would become so unresponsive that it seemed like it was locked up.

After moving my Aeon v1 HEMs to a second HE hub, my main hub is now working like a champ. No more slowdowns. I know it was the chatty devices, since the issue of slowdown has now moved to the second hub, which only has those HEMs on it, and when that was previously my test hub, it never slowed down. So I've solved the issue by simply auto rebooting that hub at 4am each day. That's a very cost effective solution for me, since purchasing two brand new Aeon v5 HEMs would be $300 CAD and there is no guarantee that having them refreshing at that rate would not just duplicate the issue on newer hardware.

Yes, take care with this app (rootin tootin). I had it get into an endless cycle (even tho' it is supposed to take note of a parameter which controls the number of reboots allowed). This happened while I was away. I had to physically remove the hub from the Internet to stop the reboot loop and then removed the app.
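For anyone curious what a "maximum number of reboots" safeguard like the one mentioned above could look like, here is a minimal sketch. It is not the app's actual code; the class name, parameters, and the sliding-window approach are all assumptions for illustration. One thing it makes obvious: the reboot history has to live somewhere that survives a reboot, otherwise the counter resets every time and the limit never triggers.

```python
import time
from collections import deque


class RebootLimiter:
    """Hypothetical guard: allow at most max_reboots per sliding window.

    The history is kept in memory here for simplicity. A real reboot guard
    would need to persist it (file, hub state, etc.), because in-memory
    state is lost on every reboot.
    """

    def __init__(self, max_reboots, window_s, clock=time.time):
        self.max_reboots = max_reboots
        self.window_s = window_s
        self.clock = clock          # injectable so the logic can be tested
        self.history = deque()      # timestamps of reboots already allowed

    def allow(self):
        """Return True and record the reboot if it is within the limit."""
        now = self.clock()
        # Forget reboots that have aged out of the window.
        while self.history and now - self.history[0] >= self.window_s:
            self.history.popleft()
        if len(self.history) >= self.max_reboots:
            return False
        self.history.append(now)
        return True


if __name__ == "__main__":
    limiter = RebootLimiter(max_reboots=2, window_s=3600)
    # Three requests in quick succession: the third one is refused.
    print([limiter.allow() for _ in range(3)])
```

If the history were reset on each boot, `allow()` would return True forever, which is one plausible way to end up in a loop like the one described.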

Thought I would report in now that I've been using the Rebooter app for a few weeks. It was working well with 2 reboots a week until a few days ago, when I again found the hub had just shut down in the middle of the day, so I am now up to 3 reboots a week. Other than that, this seems to be what's needed.

Things just seem to continue to get worse. I am seeing yet more instances of the hub being on but just not reachable, requiring me to pull the power. This is even occurring on the same day as a scheduled reboot. @bobbyD, I sent a support email 2 days ago. Please let me know what I can do to help you help me.

Is the LED green? Or red?

Blue. Mine doesn't have green but same idea. It's on but not responsive.

Just yesterday I had another slowdown where the hub was reachable but not otherwise able to function as far as I could tell. Anytime I tried to access a device or interact with a dashboard I would get an Error 500. This time I was able to find some info in the logs.

IMG_7812

IMG_7811

IMG_7809

IMG_7808

Do you have anything rebooting your hub automatically? All of those errors are typical of what you would see if you forced the hub to restart in the middle of an operation.

Ok, good to know. I do use the Rebooter app as a response to my hub issues, as recommended in this thread. Thanks for the reply. The errors had me more worried.

So I finally got a support response to this issue and the following was identified...

  1. I have about a dozen Fibaro flood sensors that are apparently connected in secure mode and this can apparently create a lot more overhead for the hub. I had no idea there was even a selectable connection mode let alone how to set it. Seems like this should be preset to non-secure by default, which after reading the thread support pointed me to for further info seems like that was actually done at some point. Clearly not on my hub as I never touched that setting. Didn't even know it existed or where it was till now.

  2. My hub's web server is getting overloaded. I have one device in my house that displays my dashboards via the web link. I'm sorry to see that this one instance is too much. I would happily change it to access the dashboards via the local link if only the local link was HTTPS with a signed certificate. Not yet sure how to address this. I did disconnect this to test its impact and I do notice a difference, so I will have to figure this one out.

Anyway, that's where things are.

Yes, security adds a little overhead. If these were chatty devices, or lots of them doing something like sending energy reports every 30 seconds, then yes, I'd say this was the cause. But these are flood sensors that only send information when they are wet or on a normal wake-up to report battery, so I'm flagging this as a red herring.

This I don't doubt at all.

To each their own, but I see little reason to run SSL on an internal dashboard, and the SSL again just adds overhead to the webserver processing, which is more overhead than a few Z-Wave devices against the hub CPU.

Looks like they report temps every 30 min. I have no idea what their check-in schedule is like other than that. Some are plugged in, so they might be more chatty than the ones on battery.
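To put rough numbers on the "chatty" question, here is a back-of-the-envelope sketch. It assumes exactly 12 flood sensors (the thread says "about a dozen") each sending one temperature report every 30 minutes, and compares that with the two energy meters polled every second mentioned earlier in the thread. It only counts scheduled reports, not wake-ups, alarms, or any security-related protocol overhead, so treat it as an order-of-magnitude comparison only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def reports_per_day(device_count, interval_s):
    """Scheduled reports per day for device_count devices at a fixed interval."""
    return device_count * SECONDS_PER_DAY // interval_s


# Assumed: 12 flood sensors, one temperature report every 30 minutes.
flood_sensors = reports_per_day(12, 30 * 60)

# From earlier in the thread: 2 energy meters polled every second.
energy_meters = reports_per_day(2, 1)

print(flood_sensors)                   # 576
print(energy_meters)                   # 172800
print(energy_meters // flood_sensors)  # 300
```

Under these assumptions the flood sensors generate a few hundred reports a day, roughly 300 times fewer than the one-second polling case.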

I am displaying my dashboard in an iframe as part of a larger display of information. Since browsers will not display mixed content, I can only use an HTTPS link if I want the master page to be secure. I could access the master page via HTTP, but it makes me a bit nervous. Maybe it's not a big deal, but it is serving a family calendar and family pictures among other things. I would prefer it to be secure. I don't know enough about how this works, but my concern is that by using HTTP on the main page, even though my hub link is a local one, some related data might somehow be sent over the internet back through the server hosting the main page. Maybe not, but I have my alarm system, door locks and security cameras tied in.
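The mixed-content constraint described above can be sketched as a simple scheme check. This is a simplified illustration, not how any browser actually implements it: real browsers have extra rules and exceptions (loopback addresses, for example, are usually treated as trustworthy). The URLs below are made up for the example.

```python
from urllib.parse import urlsplit


def is_blocked_mixed_content(page_url, frame_url):
    """Simplified rule: an HTTPS page may not embed a plain-HTTP iframe."""
    page_scheme = urlsplit(page_url).scheme.lower()
    frame_scheme = urlsplit(frame_url).scheme.lower()
    return page_scheme == "https" and frame_scheme == "http"


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    secure_page = "https://family.example/board"
    local_dashboard = "http://192.168.1.50/dashboard"
    cloud_dashboard = "https://cloud.example/dashboard"

    print(is_blocked_mixed_content(secure_page, local_dashboard))  # True
    print(is_blocked_mixed_content(secure_page, cloud_dashboard))  # False
```

This is why the choice ends up being between an HTTPS dashboard link inside an HTTPS page, or dropping the whole page to HTTP.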

Yea, I get it, but I have others in the family who are connecting their devices to this network and it is hard to control what's on their devices. It doesn't take much for someone to get some malware or something and have it on the local network sniffing around. There are plenty of other things to be concerned about if this happens, but I want to make every effort I can.

These I would look into.

Oh the fun of those devices. VLAN segmentation and guest networks :slight_smile:

You might consider not using the rebooter app or cutting back on its frequency dramatically. It looks like it was rebooting multiple times in a short span. With that much rebooting there is so much initialization of apps/drivers (and potentially other internal, undocumented out of the box behavior) that it is counter productive and causing additional overhead. After a reboot it takes my hub 10 to 15 minutes for everything to calm down again.

Also, just from a health of the hub perspective... I only force a reboot as a last resort aka things stop responding or things are responding way too slowly (and that happens rarely). You wouldn't force reboot your car's motor, PC or phone because it could corrupt storage or leave a job incomplete that won't know how to resume, etc.

It's clear from your logs that actions are being interrupted by the rebooting and they don't know how to react. That would worry me enough to pause the app if it were my hub.

It's set to reboot once on Monday, Wednesday and Friday at 3am. That's it. These logs did not come from a time close to those reboots either.
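One way to double-check whether a log entry falls near a scheduled reboot is sketched below. It assumes the schedule described above (Monday, Wednesday and Friday at 3am) and borrows the 15-minute settle time mentioned earlier in the thread as the "after" window; the 5-minute "before" window and the function name are assumptions. Because the window is small and centred on 3am, only the same calendar day needs checking.

```python
from datetime import datetime, timedelta

REBOOT_WEEKDAYS = {0, 2, 4}  # Monday, Wednesday, Friday (datetime.weekday())
REBOOT_HOUR = 3              # 3am local time


def near_scheduled_reboot(ts,
                          before=timedelta(minutes=5),
                          after=timedelta(minutes=15)):
    """Return True if ts falls within the window around a scheduled reboot."""
    if ts.weekday() not in REBOOT_WEEKDAYS:
        return False
    reboot = ts.replace(hour=REBOOT_HOUR, minute=0, second=0, microsecond=0)
    return reboot - before <= ts <= reboot + after


if __name__ == "__main__":
    # 2024-01-01 was a Monday.
    print(near_scheduled_reboot(datetime(2024, 1, 1, 3, 10)))  # True
    print(near_scheduled_reboot(datetime(2024, 1, 1, 14, 0)))  # False
```

Running the timestamps from the log screenshots through a check like this would show whether any of the errors line up with a scheduled reboot.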