[Staff Support Request] Hub Networking Errors

@bobbyD @support_team
I have a C5 running 2.3.5.121. All of a sudden I'm having trouble with my hub and cloud-based apps. Multiple apps can't seem to connect online, so I don't think the issue is with any one app:
[two screenshots of hub log errors]

The hub's Network Test passes without trouble, though.

All of my non-smart-home devices seem able to connect to the internet just fine, so I don't think it's a problem with my actual network. I have overridden the DNS settings on the hub using the recommended DNS servers, and have also overridden the DNS settings on my router (Orbi RBR50) with the same DNS servers. The problem persists.

A hub reboot seems to fix the issue, but only temporarily - it keeps resurfacing. Any suggestions?

My hub is connected via Ethernet, with DHCP and auto reconnect every 60s. I set a static IP for the hub with my router.

I have disabled the 1200 and 1313 apps but the problem persists.

I have a C8 on the way, but really want to get this resolved before migration, as I would rather not propagate the issue to the new hub.

Do you by any chance have the Hub Information driver installed? If so, what does the hub's memory look like when these errors start to happen? If not, can you install it and check the hub's status when the errors start?

Those two out-of-memory errors are very concerning. How long did you leave it in this state? It looks like the majority of those errors are from SharpTools. Can you try disabling SharpTools to see if that helps as well?

Check the App and Device stats tabs on the logs page after the hub has been running for a little while. Anything consuming a lot of resources should stand out pretty clearly in there. It could be one custom app or driver that is taking the hub into a downward spiral.

Cross-linking to the original post from a few days ago in the more generic Get Help category, in case there are any clues there that might help others.

SharpTools by its nature is likely processing a lot more events than other apps, though with very little processing per event, as the app is a lightweight event forwarder. Since most people who use SharpTools have several dashboards with many (most?) of their devices displayed in some fashion, the app ends up touching a lot of events.

That being said, it's a perfectly reasonable approach to disable apps one-by-one and see if anything causes the errors to stop. :smiley:

Right, but something is consuming resources. The fact that there is an out-of-memory error in that screenshot in the first post means something is causing a serious problem. I have only seen that error message pop up when there is a major resource problem somewhere in the hub.

I use ActionTiles myself and that app is similar: it doesn't do anything but forward events to their outside service. That said, it is consistently one of the busiest apps on my hub.

I would still like to know what free memory the Hub Information driver reports. My guess is we will find that memory is being consumed by something.

No errors today, which I guess is good, but I suspect they will recur at some point soon. At the moment, the Hub Information driver shows free memory at 198048.

190 MB after 2 days is very low. It takes over a week for my hub to get below 200 MB.
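
For anyone comparing the two figures: the numbers line up if the driver's raw reading is in kilobytes. That unit is an assumption on my part, since the thread doesn't state it. A quick conversion sketch:

```python
def kb_to_mb(kb: float) -> float:
    """Convert kilobytes to megabytes, using 1 MB = 1024 KB."""
    return kb / 1024


# Reading quoted earlier in the thread; the KB unit is assumed, not confirmed.
free_kb = 198048
print(f"{kb_to_mb(free_kb):.1f} MB")  # prints "193.4 MB"
```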

Is that the v2 or the new v3 Hub Information driver? What is your refresh interval?
Also, what is the 3D Printer device? Is it integrated via Wi-Fi?

I would disable that 3D printer device and see if that helps.

v2.7.22

The polling rate was at the default of 300, but I increased it to 1200 once I saw it was consuming a lot of resources.

It's the OctoPrint cloud integration. I will increase the idle polling from 5 minutes to 1 hour to see if that makes a difference.

You should update to v3. It has four separate polling schedules, so you can poll the metrics you care about regularly and push the stuff that rarely changes out to a less frequent schedule: [RELEASE] Hub Information Driver v3

You may also want to reboot now, since memory is already very low. Start fresh with the changes and see how quickly it drops.
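
To put a number on "how quickly it drops", one option is to fit a straight line through a handful of free-memory readings and look at the slope. This is only a rough sketch: the readings below are made up for illustration, and the KB unit is an assumption.

```python
from typing import Sequence, Tuple


def drop_rate_kb_per_hour(samples: Sequence[Tuple[float, float]]) -> float:
    """Least-squares slope of free memory (KB) against time (hours).

    A negative result means free memory is shrinking.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must cover more than one point in time")
    return num / den


# Hypothetical readings: (hours since reboot, free memory in KB)
readings = [(0, 400000), (12, 350000), (24, 300000), (48, 200000)]
rate = drop_rate_kb_per_hour(readings)
print(f"{rate:.0f} KB per hour")
```

Comparing the slope before and after disabling a suspect app or device gives a more objective signal than eyeballing the graph.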

What's a good polling interval for free memory?

Here is how mine are set up. I am polling free memory every 12 minutes. I tried to arrange it so the two most frequent polls would not typically land on top of each other. Also, there is no point in going below 5 minutes for CPU, since the hub only updates that metric every 5 minutes. Some people use a shorter interval for memory, but I personally don't see a reason for that.
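
As a rough illustration of the "not landing on top of each other" point: two polls that start together fire at the same moment again every least common multiple of their intervals. The sketch below assumes both schedules start at the same instant, and the interval pairs are hypothetical examples rather than anyone's actual settings.

```python
from math import gcd


def minutes_between_overlaps(interval_a: int, interval_b: int) -> int:
    """Minutes between moments when both polls fire together (their LCM)."""
    return interval_a * interval_b // gcd(interval_a, interval_b)


def overlaps_per_day(interval_a: int, interval_b: int) -> int:
    """Coincidences in 24 hours, not counting the shared start."""
    return (24 * 60) // minutes_between_overlaps(interval_a, interval_b)


# Hypothetical interval pairs, in minutes
for a, b in [(5, 10), (5, 12), (5, 13)]:
    print(f"{a} and {b} min: together every "
          f"{minutes_between_overlaps(a, b)} min, "
          f"{overlaps_per_day(a, b)} times per day")
```

Intervals that share few factors (5 and 12, say) coincide far less often than ones where one interval is a multiple of the other (5 and 10).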

The reason for a shorter free-memory interval is to get visibility into short-lived fluctuations. I have seen occasions where my memory drops dramatically and then recovers within a few minutes. A 5- or 10-minute interval wouldn't show whether that was the cause of a system event. I use a 1-minute interval, but I collect it with Node-RED.
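
A small simulation of that point, using made-up numbers: a dip that lasts three minutes is visible at a 1-minute interval but can fall entirely between samples at 5 or 12 minutes.

```python
from typing import List


def sampled_minimum(series: List[int], interval: int) -> int:
    """Lowest value seen when reading every `interval`-th sample from index 0."""
    return min(series[::interval])


# Hypothetical per-minute free memory (KB): steady for an hour,
# with a three-minute dip starting at minute 31.
per_minute = [300000] * 60
per_minute[31:34] = [90000, 80000, 95000]

for interval in (1, 5, 12):
    low = sampled_minimum(per_minute, interval)
    print(f"every {interval} min: lowest reading {low}")
```

Whether a coarse interval catches a dip depends on where the samples happen to fall, so the absence of a low reading at 12 minutes does not show that memory never dropped.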

New errors appeared today:

Device 1641 is flume
Device 1438 is Meater probe
The SharpTools Bad Gateway error is a new one, though.

Hub Memory:

Does this rule out memory as being the culprit?

What is your interval for collecting the memory?

12 minutes (I copied your settings in the hub info driver)

Can you set it to a shorter interval, like 5, 2, or even 1 minute? That value is read as an instantaneous value, so it can be gathered much more frequently. Once we get to the bottom of this I would change it back to 5 minutes or something larger, but with it set lower we could potentially see how long the memory stays low, or whether it goes even lower.

It may also be good, the next time this happens, to collect and post the App and Device stats from the logs page. It may at least point us in a direction. All of the errors you are posting point to a networking issue, like you are calling out. Are most of your devices Wi-Fi devices that use polling to obtain device status and such?

Yes

It may also help to see what was happening right before those events occurred. Can you scan the logs to see what activity was occurring just before those errors? Maybe that will show a precursor event.
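
If the logs are exported somewhere they can be processed, the "what happened just before" check can be sketched roughly like this. The `LogEntry` shape, the level names, and the sample entries are all hypothetical and do not reflect the hub's actual log format.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple, Tuple


class LogEntry(NamedTuple):
    time: datetime
    level: str
    message: str


def entries_before_errors(
    entries: List[LogEntry],
    window: timedelta = timedelta(minutes=2),
) -> List[Tuple[LogEntry, List[LogEntry]]]:
    """Pair each error with the non-error entries logged in the window before it."""
    ordered = sorted(entries, key=lambda e: e.time)
    result = []
    for err in ordered:
        if err.level != "error":
            continue
        lead_up = [
            e for e in ordered
            if e.level != "error" and err.time - window <= e.time < err.time
        ]
        result.append((err, lead_up))
    return result


# Made-up entries, for illustration only
base = datetime(2023, 1, 1, 2, 15)
log = [
    LogEntry(base, "info", "backup started"),
    LogEntry(base + timedelta(seconds=50), "warn", "poll timed out"),
    LogEntry(base + timedelta(seconds=90), "error", "connection refused"),
    LogEntry(base + timedelta(minutes=30), "info", "poll ok"),
]
for err, lead_up in entries_before_errors(log):
    print(err.message, "<-", [e.message for e in lead_up])
```

If the same activity keeps showing up in the lead-up to different errors, that is a candidate for the precursor event.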

What polling interval is your Ecobee Suite integration running at?

Also, what time do you have the hub backup job running?

New errors (happening a lot now, it seems):

I updated the memory poll to run every 1 minute, and it seems memory never drops very low.

Ecobee polling is set to 5 minutes.
Hub backup set to 2:15AM every day

Latest device and app stats


Can you adjust Ecobee Suite to 1 minute instead of 5? The developer has indicated that a lower setting is actually better, given how it works.

On your hub memory graph, can you zoom in to show only the data collected at 1-minute intervals?

Those SharpTools errors look like they are being triggered by the remote side. Do you have a firewall? What is your network setup like: which devices do you have, and how are they connected?