Hubs hang after a few months' operation

I have two C-7 hubs: one at home and one at our mountain cabin. Both have Aeotec temperature sensors and are configured to report temperature and humidity readings to a web application I set up.

Six weeks or so ago, the hub at home stopped responding (wasn't running any rules, wasn't reporting temperature readings, and the Maker API wasn't responding). I power cycled the unit and it has been working fine since.

A few days ago, the hub at the cabin stopped responding in a similar way. Initially it responded very slowly to some remote administration commands, but I was not able to get it to reboot successfully. Fortunately, I'm going to the cabin soon and can power cycle it (assuming that's what it needs).

This latest hang concerns me, because the cabin is often not accessible in the winter and we use the Hubitat to control a heater to keep the pipes from freezing.

The home C-7 had been working for quite some time before the recent hang. I had more recently added more temperature sensors (Aeotec AerQ) and started using the Maker API feature to POST device events to an external URL, so it's possible that the hangs were prompted by one of those things.

The nature of these hangs makes me think there may be a "memory leak" related to the new things I'm doing, and that the hub eventually runs out of memory. Is there a way to monitor the free memory on the hub while it is working? It would at least be a workaround for me to monitor memory usage and if necessary reboot the hub from time to time.

-Jim

I personally think that's a mistake.

You can monitor memory with the excellent Hub Info app.
Or, you can do what I did yesterday, only instead of using another hub to do the power cycle, just reboot on the main hub:

This kind of dependency on the hub is normally frowned upon. These hubs are intended to simplify your life, but when you need something that is 100% operational 24/7, software glitches can cause issues.

If you have pulled the power from these devices, consider doing a soft reset to ensure nothing in the database got corrupted.

As with most technology devices, memory consumption will grow over time no matter what is done. No code is 100% perfect, so every device will eventually need a restart. You may want to consider a scheduled clean reboot so that the device is periodically returned to a clean state.

You may want to consider using the Hub Information driver, as @velvetfoot stated, to put hub event data into a DB so it can be visualized. You may find that over that 6-week period the free memory gradually dropped to around 150 MB, which is where you likely started to have slowdowns and the eventual hang. The tools used for this are either webCoRE on the hub with Hubigraphs, or InfluxDB and Grafana, which can be cloud-based or local depending on how much you want to set up.

You may want to consider putting a USB smart switch between the power adapter and the hub. In dire circumstances, that would let you power cycle the hub remotely when you are not there, as long as you have internet access.

@user4390
I had the same thing happen to one of my clients.
It turned out that the amount of free memory had dwindled over time, and so the hub conked out.
Power cycling will restore much of the free memory.
Also, if you want to make sure that it never happens again, you can install a driver from Hubitat Package Manager called Hub Information. It contains an attribute called free memory. It also allows you to reboot directly from that device.
If you want, I will also show you the code to do that automatically once it gets into a low-memory state.

What is that? :slight_smile:

I wonder if this is Hubitat's position as well. I suspect their lawyers wouldn't want them having any sort of liability if damage occurred as a result of a hub failure, but I don't think they would want to say that their product has an MTBF of only a few weeks.

Memory leaks aren't inevitable. I used to work for a large maker of internet backbone routers; imagine if they needed to be rebooted periodically to avoid having parts of the internet lose connectivity. We took memory leaks very seriously.

Thanks to you and others for the pointer to the HubInfo app; I wasn't aware of it. I just installed it on my home hub (370280K free currently) and will keep an eye on memory usage.

I'm thinking about that. Ironically, the Hubitat hub is taking the place of a manual WiFi switch that used to control the heater. That switch lost connectivity to my app (probably due to a software update), and the manufacturer's tech support told me to re-pair the switch. With the switch 100 miles away and snow several feet deep, that was not a helpful suggestion! My home Hubitat hub had been very reliable up to that point, which prompted me to install one at the cabin.

Thanks. Is it as simple as adding the Hub Information driver to the list of devices used by the Maker API app, and then POST requests will be generated for free memory changes (in addition to my existing temperature/humidity info)? Or do you have something that will locally cause a reboot?
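If the free-memory events do arrive the same way as the temperature ones, the receiving web application could pick them out roughly as follows. This is a minimal sketch, not a confirmed description of what the hub sends: it assumes each POST body is JSON shaped like `{"content": {"name": ..., "value": ...}}` and that the Hub Information attribute is named `freeMemory` and reported in KB. Check both against a real request before relying on it.

```python
import json

# Assumption: the Hub Information driver's attribute is called "freeMemory" (KB).
FREE_MEMORY_ATTRIBUTE = "freeMemory"


def extract_free_memory_kb(body):
    """Return free memory in KB if this POST body is a free-memory event, else None.

    Assumption: Maker API wraps each event as {"content": {"name": ..., "value": ...}}.
    """
    try:
        event = json.loads(body)["content"]
        if event["name"] != FREE_MEMORY_ATTRIBUTE:
            return None  # some other event, e.g. temperature or humidity
        return int(float(event["value"]))
    except (ValueError, KeyError, TypeError):
        return None  # not JSON, or not shaped the way this sketch assumes
```

The function returns `None` for anything it does not recognize, so existing temperature and humidity handling can run unchanged on the same endpoint.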

They have said multiple different times that it should not be used for anything safety, life, or property critical, so yes I think that is their position as well. But I don't work for them, I'm just commenting on past feedback.

Did your routers have a development team of <10 people and cost $149? Your expectations may not be reasonable for this price point/target market.

Anyway, I'm not here to argue. But I would say your use case is VERY ill advised with this product (and any other commercial hub of this kind). You do you, though.

The HubInfo driver has the ability to trigger a reboot. You would just create a rule to run when freemem drops below a threshold.

My rule runs at 3am and checks the value. If it is below a threshold, it triggers a reboot with the driver.
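The real rule lives in Rule Machine on the hub; the Python below only sketches the decision it makes, for anyone who wants to reason about it or replicate it in an external monitor. The threshold and check time are hypothetical values for illustration, not a recommendation.

```python
from datetime import time

# Hypothetical values for illustration; tune them to your own hub.
THRESHOLD_KB = 175_000
CHECK_TIME = time(3, 0)


def should_reboot(free_memory_kb, now, check_time=CHECK_TIME, threshold_kb=THRESHOLD_KB):
    """Return True only at the scheduled minute, and only if free memory is low.

    `now` can be a datetime.time or datetime.datetime; only hour and minute are used.
    """
    at_check_time = (now.hour, now.minute) == (check_time.hour, check_time.minute)
    return at_check_time and free_memory_kb < threshold_kb
```

Checking once per day, rather than on every memory report, keeps the rule from firing a reboot in the middle of the day.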

Here is my rule for rebooting (with thanks to @thebearmay):

The author of Hub Information (@thebearmay) made it very easy: not only can you tell how much free memory there is from that one device, but you can reboot the hub directly from it as well!

Here is mine. I like this because it does a single check at a specified time. This could prevent added stress on the hub at a time when other system-intensive tasks could hit. The previous rule will check at every collection interval during the time window set by its required expression. If you have it collect memory frequently, that could cause a lot of extra load.

I agree they are not, but the more complex and multi-use the environment, the more likely they are to occur. As others have pointed out, there is a huge difference between core internet routers and a $149 home automation hub. I would bet those routers were running machine code in a custom OS instead of Java in a Linux distro for ARM. I tend to think most of the memory issues are related to Java and how much we throw at these hubs.

Close enough: custom OS but written in C.

Anyway, thanks for everyone's help in working around memory exhaustion. I'm rather surprised that there is this much awareness of the problem and many people who have dealt with it -- I would have thought that Hubitat would have at least made the problem less prominent.

This is true for the reasons mentioned. I really like my Hubitat and it's far better than my old SmartThings hub, but an honest review of mentions of "issues" on this forum will show that hubs becoming slow, unresponsive, flaky, etc. due to memory leaks is a long-standing problem. This really needs to be taken more seriously by the developers, and YES, we need a deluxe hub option with the ability to add more memory. Seriously, storage is the cheapest it's ever been and running on the bare minimum is silly at this point. Please, HE developers, give this more attention and/or at least give us some tools to see the amount of memory each App/Device is using. Not the CPU utilization, the memory utilization.

Evidence it's not a small problem:

Hi,

This topic had me look at my hubs. They were not slow or unresponsive (yet). I log the information in InfluxDB. The C7 free memory flattened out at 110K and I started seeing a lot of excessive load messages in the log. Similar for the C8 at 170K. Both after running for about 75 days. So I have now added a reboot if the memory drops below 120K or 180K respectively, and will see what happens.

Cheers Rene.
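For anyone logging free memory over time, a rough projection of when a hub will reach its reboot threshold can be sketched with a least-squares line through the samples. This assumes the decline is roughly linear, which may not hold (the values above flattened out rather than falling steadily), and the sample numbers in the usage note are hypothetical.

```python
def days_until_threshold(samples, threshold_kb):
    """Estimate days from the last sample until free memory hits threshold_kb.

    samples: list of (day_number, free_memory_kb) pairs.
    Returns None if there are too few samples or memory is not falling.
    Assumption: free memory declines roughly linearly over time.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_day = sum(day for day, _ in samples) / n
    mean_mem = sum(mem for _, mem in samples) / n
    spread = sum((day - mean_day) ** 2 for day, _ in samples)
    if spread == 0:
        return None  # all samples taken on the same day
    slope = sum((day - mean_day) * (mem - mean_mem) for day, mem in samples) / spread
    if slope >= 0:
        return None  # memory is flat or growing; no crossing to predict
    intercept = mean_mem - slope * mean_day
    crossing_day = (threshold_kb - intercept) / slope
    last_day = max(day for day, _ in samples)
    return max(0.0, crossing_day - last_day)
```

With hypothetical samples of 370000, 340000 and 310000 KB on days 0, 10 and 20, the fitted loss is 3000 KB per day, so a 120000 KB threshold would be reached a little over 63 days after the last sample.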

@user4390 I would use @thebearmay's excellent Hub Information driver. Write a rule so that if the hub's free memory falls below 120000, the hub reboots, and then after the reboot a hub event rule sends a message to your phone saying that the hub rebooted. This would confirm what's going on.

I would also do a soft reset to confirm a clean database.

They have. It happens A LOT less than it used to. :man_shrugging:

Maybe I'll revise my rule down to 120k.

Or maybe I'll stick with 175k :slight_smile:

For you people doing your rebooting stuff around 3AM, check out your daily backup timeframe. Wouldn't want a conflict.
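A quick way to sanity-check the two schedules against each other is sketched below. The 15-minute margin is a hypothetical choice, not a documented figure for how long a backup takes, so look up your own hub's backup time and pick a margin you are comfortable with.

```python
from datetime import datetime, timedelta


def times_conflict(reboot_hhmm, backup_hhmm, margin_minutes=15):
    """Return True if the daily reboot and backup times fall within the margin.

    Times are "HH:MM" strings in 24-hour format. margin_minutes is a
    hypothetical safety gap, not a value taken from the hub.
    """
    fmt = "%H:%M"
    reboot = datetime.strptime(reboot_hhmm, fmt)
    backup = datetime.strptime(backup_hhmm, fmt)
    gap = abs(reboot - backup)
    gap = min(gap, timedelta(days=1) - gap)  # handle times either side of midnight
    return gap < timedelta(minutes=margin_minutes)
```

For example, `times_conflict("03:00", "03:05")` flags a clash, while a reboot at 03:00 and a backup at 02:15 are far enough apart under the default margin.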
