[Staff Support Request] Hub Networking Errors

Have you also tried using your normal DNS server for your LAN? By default that is usually the router, unless you have a separate DNS server.

You could try playing around with this DNS benchmark software: GRC's | DNS Nameserver Performance Benchmark

It's either a DNS issue (with LAN or ISP), or the hub is getting so bogged down with outbound IP traffic that it is not processing the requests quickly enough.

This could very well be it, as I have quite a few cloud apps/drivers. Is there any way to drill down into this, such as by logging timestamps of outbound IP requests to a file? If this ends up being the culprit, I guess a second hub to split up the IP traffic would help...

Yes, that was what I had it set on initially, and then switched to this as part of troubleshooting.

I don't think so; it would have to be done in each app/driver.

You could look at the device/app stats in the logs area and sort by "Hub Actions". These are often created by IP integrations, and high numbers could point to bad code causing problems. You could also check the "Hub Actions" tab, but that only shows currently pending actions. The tab is explained here: Logs | Hubitat Documentation

I ordered a C8 and will see if splitting up my cloud apps between two hubs solves it.

I put only two apps so far on the C8, Rachio and Ecobee. Both cloud heavy. But I am even getting read timeouts sometimes on the C8, so I don’t think it’s load related. Or hub related for that matter…

Yeah probably networking related.

There are a bunch of different errors I see in various screenshots. I am not seeing any that looks like DNS issues. Usually you would see something about failed to resolve...
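If you want to rule DNS in or out from another machine on the same LAN, a small timed lookup could be run in a loop and logged. This is a minimal sketch, not something from the hub itself: `time_lookup` is a made-up helper name, and the `resolver` parameter exists only so a stand-in can be swapped in when testing.

```python
import socket
import time


def time_lookup(host, resolver=socket.getaddrinfo):
    """Return (ok, seconds, detail) for a single name lookup.

    `resolver` defaults to the system resolver; it is a parameter only so
    that a fake can be substituted (an assumption made for this sketch).
    """
    start = time.monotonic()
    try:
        results = resolver(host, 443)
        detail = results[0][4][0]  # first resolved address
        ok = True
    except socket.gaierror as exc:
        # This is the "failed to resolve" style of error mentioned above.
        detail = f"failed to resolve: {exc}"
        ok = False
    return ok, time.monotonic() - start, detail
```

A slow or failing lookup here at the same time the hub logs errors would point toward DNS; consistently fast lookups would point elsewhere.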

Some of the errors look like the other end is rejecting the connection for some reason. This could be an internet issue or maybe a LAN issue if the connection is getting cut off.

Other errors look like the hub cannot find the destination, which I would assume is a LAN issue, possibly the hub not finding the proper gateway to reach the internet. Some of the errors even say "bad gateway".

What do you have for networking equipment? Routers, APs, Switches, Firewalls...?
What is your ISP and are you behind CGNat?

I was getting these same errors when I had an RBK50 Netgear Orbi mesh, with 2 satellites. I’ve since replaced that with an RBK852 Netgear Orbi mesh. Still getting the same errors. I have two Legrand DN1008 switches. Probably another switch or two somewhere else.

No standalone firewall other than what’s built in. ISP is Ting fiber (gigabit). Don’t think I’m behind CGNat.

Do you think I should try plugging the hub directly into my router? That’s going to change its location though, so that might mess up my Z-Wave network?

Might be a good test just to eliminate some variables. Unless it is a very drastic change in location it should not cause too much chaos. Some of the devices might be a little sluggish for a day or two but they will figure it out eventually.

Just got another Sharptools read timeout with the hub connected directly to the router. I'm not optimistic this was the fix. I am beyond frustrated at this point, especially with little to no formal support from Hubitat. This issue is really undermining my confidence in this platform. But I am so invested code-wise in this platform that I want to figure it out.

The problem is that several of those errors indicate the problem is on the remote side.

A read timeout generally means a connection was made but the remote system timed out on an application call.

A Bad Gateway error can be a reverse proxy problem or a server error when the remote side calls its internal APIs.

Connection Refused generally means the device you are trying to talk to isn't listening on that port.

No route to host generally means the remote device is offline (or its IP has changed, as pointed out).
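As a rough aid for sorting through a pile of log entries, those rules of thumb could be encoded as a lookup. This is only a sketch: the substrings are assumptions about how the messages are worded in the hub logs, and the hints restate the explanations in this thread rather than any official Hubitat list.

```python
# Substring -> rule-of-thumb hint. The substrings are guesses at the log
# wording; adjust them to match what your logs actually say.
HINTS = [
    ("failed to resolve", "dns: name lookup failed (LAN or ISP resolver)"),
    ("read timed out", "read timeout: connected, but the remote side was slow to answer"),
    ("read timeout", "read timeout: connected, but the remote side was slow to answer"),
    ("bad gateway", "bad gateway: reverse proxy or server error on the remote side"),
    ("connection refused", "connection refused: nothing listening on that port"),
    ("no route to host", "no route: remote device offline or its IP changed"),
]


def classify(message: str) -> str:
    """Return the first matching hint for a log message, else 'unclassified'."""
    lowered = message.lower()
    for needle, hint in HINTS:
        if needle in lowered:
            return hint
    return "unclassified"
```

Counting the results per category over a bad day would show whether one kind of failure dominates or whether it really is a mix.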

As long as you are not getting Java heap memory errors, this looks to me like a network or remote host issue, or possibly some kind of firewall filter setting. That isn't a problem with Hubitat and is likely why you are not getting much engagement from support.

No memory errors thank goodness. Remote host issue would be unusual since so many different remote hosts are involved (multiple different apps). I do use a Dynamic DNS service via Netgear (which uses noip.com). Starting to wonder if that’s the culprit. I’ll turn it off to see if that fixes things.

I would absolutely understand if Hubitat staff engaged me substantively in support and reached the conclusion that it’s a network issue and said “good luck with that”. But they haven’t done that.

How frequently do you get these errors? Are we talking all day long and every day, or a few times a day?

I get errors on a somewhat regular basis with the Ecobee and Govee integrations. It isn't constant, but maybe a few times a day every once in a while. These are not hub or network issues I am having, but cases where the remote side is having issues. If you are seeing something similar it may just be the nature of cloud-based systems.

HE staff doesn't really need to confirm it if all of the tests they have asked you to run come back clean and you have experienced the same problem on two separate hubs. Not sure what else they can suggest to you.

The dynamic DNS thing shouldn't be an issue as I understand it, as that should be providing name resolution back to your network and nothing more.

That’s fine if that’s the technical conclusion of the matter from the Hubitat side. My point is that there’s been insufficient technical support engagement from a customer service standpoint, which has left the matter unresolved - as in, I have no idea if they would or have concluded that there’s nothing left to suggest.

It varies. Sometimes it will go a few days without any errors. Other times I’m inundated with them all day long.

@JustinL I would suggest setting up an external system to ping a public site continuously so you can compare it against Hubitat and determine whether the problem is HE or your fiber network. If you have Node-RED you can easily set up a flow to do this and log to a file. Then compare the HE error timestamps to this external system's log; if both fail at the same times, that would indicate an internet issue rather than an HE issue.
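The comparison step could be done by eye, or with something like the sketch below once both sets of timestamps are parsed. The function name and the 60-second window are arbitrary choices for illustration; how you parse the hub log and the probe log into `datetime` values is left out because it depends on their formats.

```python
from datetime import datetime, timedelta


def split_by_probe(hub_errors, probe_failures, window=timedelta(seconds=60)):
    """Split hub error times into (shared, hub_only).

    shared   -> a probe failure was logged within `window` of the hub error,
                which hints at an internet or LAN outage.
    hub_only -> the external probe was fine around that time, which hints at
                the hub or the specific remote service.
    """
    shared, hub_only = [], []
    for err in hub_errors:
        if any(abs(err - failure) <= window for failure in probe_failures):
            shared.append(err)
        else:
            hub_only.append(err)
    return shared, hub_only


# Usage with made-up timestamps:
hub = [datetime(2023, 1, 1, 10, 0, 0), datetime(2023, 1, 1, 12, 0, 0)]
probe = [datetime(2023, 1, 1, 10, 0, 30)]
shared, hub_only = split_by_probe(hub, probe)
print(len(shared), "shared,", len(hub_only), "hub-only")
```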

I mention this because I have ATT fiber and they applied a firmware update to their modem that caused all kinds of issues, same happened to a coworker of mine. I had that modem in DMZ mode and now completely bypass it and all is well.

Yeah, looks like this is the next step, because disabling the Dynamic DNS didn't resolve it.

I did this. The hub network calls fail even when the continuous pings succeed. That suggests to me that it's a hub issue, not a network issue, no?
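One caveat worth checking before drawing that conclusion: a ping is ICMP only, while the hub's cloud calls go through DNS, TCP, TLS and an HTTP request/response, so pings can succeed while HTTPS calls stall. A probe that makes a real HTTPS request from the Windows server might be a closer comparison. This is a minimal sketch using only the Python standard library; the function names are made up, and which URL to probe is up to you.

```python
import socket
import time
import urllib.error
import urllib.request


def describe_failure(exc: BaseException) -> str:
    """Reduce an exception from urlopen to a short label for the log."""
    if isinstance(exc, urllib.error.HTTPError):
        return f"http_{exc.code}"  # e.g. http_502 for Bad Gateway
    # urllib wraps lower-level socket errors in URLError.reason; unwrap them.
    if isinstance(exc, urllib.error.URLError) and isinstance(exc.reason, BaseException):
        exc = exc.reason
    if isinstance(exc, socket.gaierror):
        return "dns_failure"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"
    if isinstance(exc, ConnectionRefusedError):
        return "connection_refused"
    return f"other:{type(exc).__name__}"


def probe(url: str, timeout: float = 10.0) -> str:
    """Make one HTTP(S) request and return a timestamped log line."""
    started = time.strftime("%Y-%m-%d %H:%M:%S")
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            outcome = f"ok_{resp.status}"
    except Exception as exc:
        outcome = describe_failure(exc)
    return f"{started} {url} {outcome}"
```

Calling `probe(...)` every minute or so and appending the line to a file gives a log to set beside the hub errors. If this probe also times out when the hub does, the network is still a suspect even though pings succeed.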

C5 Hub errors:

Dev:1641 is Flume
App:1839 is my mower app, and looks to be an unrelated bug.

Continuous Ping Successes:
https://github.com/lnjustin/Hubitat/raw/master/pingLog.txt

Somewhat troubling to me though is that my C8 hub also experienced errors today:


These errors are from the Rachio integration. The null pointer errors appear to be secondary to the read timeout errors. The C8 hub ONLY has Rachio and Ecobee.

C5 load stats:


C8 load stats:

Network setup: Netgear Orbi Mesh RBK852.
C5 hub connected directly to router via Ethernet. C8 hub connected via a Legrand Ethernet switch.

Bottom line: I have repeated cloud call failures occurring on both C5 and C8 hubs around the same time, but continuous pings from my windows server succeed unequivocally during that time period (and beyond). Any ideas? ahem, @mike.maxwell @support_team

The Rachio and Ecobee integrations use a similar process. If Ecobee doesn't have a problem, which it doesn't, it is less likely to be a hub issue. I saw the same time-outs on Rachio earlier, which prompted me to check their social media to see how many other non-Hubitat users report issues. This is what they shared:

image

Good to know. And that would satisfy me, except it doesn’t explain the cloud errors from Sharptools and Flume. Nor does it explain this being a repeated issue occurring for months now. Any thoughts on that?

I am not familiar with Flume integration. As for SharpTools, @josh may have some insight.

Is your assessment then that these are all independent, unrelated errors? That all happen to recur periodically in waves?