Can device errors interrupt household internet access? If so, how to mitigate?

Hi there, we have a few devices that are constantly logged in online. If the internet connection gets interrupted, they get logged out, and we have to go log them all back in manually with their security tokens. We just fixed a long-standing issue with these disconnects a few months ago (it appeared to be related to the ethernet in the walls), so it's disheartening to see them come back with the only real change being the removal of the Wink and the addition of the Hubitat.

I've been running the Logs in a browser tab to see if there are any errors around the time the devices disconnect.

In one case, I suspect it might have had to do with my PowerView hub being randomly inaccessible. Network seemed to stabilize a bit after I removed the app and devices, but not exactly sure if that was the issue.

In another case, I installed a PetSafe app and device, and it threw a java UnknownHostException error while trying to refreshDevices.

The Hubitat is plugged directly into our primary router in the middle of the house. We have one other router working as a wifi mesh node in another part of the house.

Can someone who knows more about network management help me understand why and how a device's internet connection can be interrupted by another device on the same network? I know that's kind of a broad question, so links to beginner IT concepts would be welcome lol. I've tried in the past to figure out how to run logs on the router itself, or from my laptop, to see exactly what's happening to the network when this happens but I can't make heads or tails of them and don't even know if I was logging the right things.

Unless you have apps that are using the network, the hub only uses it for time sync or when you access it. You can actually unplug it from the router and it will still control all of the Zigbee and Z-Wave devices.

As far as interrupting, a few ways that come to mind immediately:

  1. if two devices have the same IP address
  2. network storm (broadcast storm) - a device hits an error condition that causes it to send out a rapid flood of broadcast messages, effectively clogging the router
  3. a device with high priority starts a large upload or download (requires that the router supports traffic prioritization) and consumes a large portion of the available bandwidth
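
For #1, here is a rough sketch of how you might spot a clash once you have a list of (IP, MAC) pairs, for example copied by hand from the router's client list. The function name and the sample addresses are made up for illustration; nothing here talks to the network.

```python
from collections import defaultdict


def find_ip_conflicts(observations):
    """Return {ip: [macs]} for every IP claimed by more than one MAC.

    `observations` is an iterable of (ip, mac) string pairs. This is a
    hypothetical helper; how you collect the pairs is up to you.
    """
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        # Normalise case so "AA:BB" and "aa:bb" count as the same device.
        macs_by_ip[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Illustrative data only: two different MACs both claiming 192.168.0.102.
sample = [
    ("192.168.0.102", "2C:FD:A1:62:9C:E9"),
    ("192.168.0.103", "F8:36:9B:83:62:8F"),
    ("192.168.0.102", "60:45:cb:d0:a6:30"),
]
print(find_ip_conflicts(sample))
```

An empty result means every IP in the list maps to exactly one MAC address.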

You could split your network into two, isolate your IoT devices onto one network, your computers on the other.

Thanks a bunch for being willing to go super basic for me haha. Thankfully we usually avoid #1 and #3 -- we haven't assigned any static IPs (I was going to for the Hubitat but was advised against it), and I usually throttle my FTP bandwidth if I'm working and have to upload/download a lot. Which I haven't the last few days.

I'll have to look more closely at the logs for #2. When it was trying to access the PowerView and failing, it did send out a flurry of warnings, but it looked like it capped it at about a dozen. Maybe I need to enable debugging on more devices for a while.

Yeah, that's kind of our last resort I think, haha. Well, second to last; last being signing up for a second internet connection with our ISP.

You could also look at your router logs to see if you’re getting sustained Denial of Service (DoS) attacks from the outside, and check with your ISP for any outages or service degradations that line up with the disconnects. Sometimes the connection is there but the name resolution service (DNS) is unavailable, which means that unless you have the IP address of where you’re going, you’re stuck (sort of like your GPS going out in an unfamiliar city: the road is there, but finding your way is challenging).
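
If you want to tell "DNS is down" apart from "the whole connection is down" from a laptop, something like this minimal sketch could help. It assumes Python 3 is available; the function name `diagnose` and the probe address 1.1.1.1 on port 53 (a public resolver that accepts TCP) are my own choices, so swap in whatever you prefer.

```python
import socket


def diagnose(hostname, probe_ip="1.1.1.1", probe_port=53, timeout=3.0,
             resolve=socket.getaddrinfo, connect=socket.create_connection):
    """Return 'ok', 'dns-failure', or 'no-connectivity'.

    `resolve` and `connect` default to the real socket calls; they are
    parameters only so the logic can be exercised without a network.
    """
    try:
        # Step 1: can we turn the name into an address at all?
        resolve(hostname, 443)
        return "ok"
    except socket.gaierror:
        pass  # Name lookup failed; find out whether the link itself is up.
    try:
        # Step 2: connect to a raw IP address, which needs no DNS.
        connect((probe_ip, probe_port), timeout=timeout).close()
        return "dns-failure"
    except OSError:
        return "no-connectivity"


# Example call (needs a live network, so it is left commented out):
# print(diagnose("api.ps-smartfeed.cloud.petsafe.net"))
```

A result of `dns-failure` would match the "Temporary failure in name resolution" style of error: the road is there, but the lookup service is not answering.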

Not sure where you saw that, but it IS definitely a good idea to have all of your IoT devices, like Alexa and Hubitat, on a static address.

Well... regardless, the chance of an IP clash is pretty limited, so if anything it's good for troubleshooting!

It isn't so much of an IP clash we are worried about, it is more if things "move around". If you are trying to access the hub at 192.168.0.102, and it isn't there anymore, that could be annoying. You have to search for the hub and input the new address.

But if you add something like a Lutron Bridge, or Alexa into the mix where they actively NEED to know where the hub is, then chaos ensues when they can't find the hub.

I don't suppose any of this means anything to you? This is just the Hubitat and Router logs from 4:19pm today. Obviously, having to suppress 10k messages 10 or more times a minute is a problem, how can I tell what's causing it?

Hubitat:

app:152 2021-02-03 04:19:20.102 pm error java.net.UnknownHostException: api.ps-smartfeed.cloud.petsafe.net: Temporary failure in name resolution on line 399 (refreshDevices)

Router:

Feb  3 16:19:01 kernel: net_ratelimit: 10661 callbacks suppressed
Feb  3 16:19:04 WLCEVENTD: eth6: Assoc F8:36:9B:83:62:8F
Feb  3 16:19:06 kernel: net_ratelimit: 10719 callbacks suppressed
Feb  3 16:19:11 kernel: net_ratelimit: 10823 callbacks suppressed
Feb  3 16:19:16 kernel: net_ratelimit: 10877 callbacks suppressed
Feb  3 16:19:21 kernel: net_ratelimit: 10795 callbacks suppressed
Feb  3 16:19:26 kernel: net_ratelimit: 10618 callbacks suppressed
Feb  3 16:19:31 kernel: net_ratelimit: 10707 callbacks suppressed
Feb  3 16:19:36 kernel: net_ratelimit: 10585 callbacks suppressed
Feb  3 16:19:41 kernel: net_ratelimit: 10405 callbacks suppressed
Feb  3 16:19:46 kernel: net_ratelimit: 10721 callbacks suppressed
Feb  3 16:19:52 kernel: net_ratelimit: 10518 callbacks suppressed
Feb  3 16:19:52 WLCEVENTD: eth6: Assoc 2C:FD:A1:62:9C:E9
Feb  3 16:19:52 WLCEVENTD: eth8: Assoc 2C:FD:A1:62:9C:EC
Feb  3 16:19:52 kernel: wfd_registerdevice Successfully registered dev wds0.0.12 ifidx 2 wfd_idx 0
Feb  3 16:19:52 kernel: Register interface [wds0.0.12]  MAC: 60:45:cb:d0:a6:30
Feb  3 16:19:53 kernel: wfd_registerdevice Successfully registered dev wds2.0.6 ifidx 3 wfd_idx 2
Feb  3 16:19:53 kernel: Register interface [wds2.0.6]  MAC: 60:45:cb:d0:a6:38
Feb  3 16:19:54 WLCEVENTD: eth6: Disassoc 2C:FD:A1:62:9C:E9
Feb  3 16:19:54 kernel: wfd_unregisterdevice Successfully unregistered ifidx 2 wfd_idx 0
Feb  3 16:19:56 kernel: wfd_unregisterdevice Successfully unregistered ifidx 3 wfd_idx 2
Feb  3 16:19:56 WLCEVENTD: eth8: Disassoc 2C:FD:A1:62:9C:EC
Feb  3 16:19:56 WLCEVENTD: eth6: Assoc 2C:FD:A1:62:9C:E9
Feb  3 16:19:56 kernel: wfd_registerdevice Successfully registered dev wds0.0.12 ifidx 2 wfd_idx 0
Feb  3 16:19:56 kernel: Register interface [wds0.0.12]  MAC: 60:45:cb:d0:a6:30
Feb  3 16:19:57 kernel: net_ratelimit: 10435 callbacks suppressed

Edit: I mean, obviously it makes sense that the SmartFeed error could have triggered it. But why would that throw so many errors? Shouldn't it just throw a few?

ASUS router?

IIRC the net_ratelimit is saying that something was trying to write to your log at too high a rate and it skipped ~10000+ records at a time.
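
If you save the router log to a text file, a small script can total up the suppressed messages and count how often each MAC address associates, which makes it easier to see which device shows up around the bursts. This is only a sketch: the regular expressions are written against the log lines posted above, and other firmware versions may format them differently.

```python
import re
from collections import Counter

# Patterns assumed from the log excerpt in this thread.
SUPPRESSED = re.compile(r"net_ratelimit: (\d+) callbacks suppressed")
ASSOC = re.compile(r"WLCEVENTD: \S+ Assoc ([0-9A-Fa-f:]{17})")


def summarize(lines):
    """Return (total suppressed messages, Counter of Assoc events per MAC)."""
    total = 0
    assoc = Counter()
    for line in lines:
        match = SUPPRESSED.search(line)
        if match:
            total += int(match.group(1))
            continue
        match = ASSOC.search(line)  # "Disassoc" lines do not match this
        if match:
            assoc[match.group(1).upper()] += 1
    return total, assoc


sample = """\
Feb  3 16:19:01 kernel: net_ratelimit: 10661 callbacks suppressed
Feb  3 16:19:04 WLCEVENTD: eth6: Assoc F8:36:9B:83:62:8F
Feb  3 16:19:06 kernel: net_ratelimit: 10719 callbacks suppressed
Feb  3 16:19:54 WLCEVENTD: eth6: Disassoc 2C:FD:A1:62:9C:E9
Feb  3 16:19:56 WLCEVENTD: eth6: Assoc 2C:FD:A1:62:9C:E9
""".splitlines()

total, assoc = summarize(sample)
print("suppressed:", total)
for mac, count in assoc.most_common():
    print(mac, count)
```

To run it on a real log, replace `sample` with the lines read from your saved file.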

Yeah, ASUS router. Doing some Googling, it's possible to put some setting lower so it won't throw so many records. Worth a try!

I'm noticing that there's a certain MAC address that throws a bajillion errors every time it's "associated." It's not the Hubitat, though, and it's not currently connected anywhere so I'm not entirely sure what it is.

What’s the MAC address? You can look it up at https://uic.win/en/mac/
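
Lookup tools like that only need the first three bytes of the MAC (the vendor prefix, or OUI). A minimal sketch for pulling that prefix out of a full address, whatever separator the log uses; the function name `oui` is just something I picked:

```python
def oui(mac):
    """Return the vendor prefix (first three bytes) as 'XX:XX:XX'."""
    # Keep only hex digits so ':', '-' and '.' separators are all accepted.
    digits = "".join(c for c in mac if c in "0123456789abcdefABCDEF")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    prefix = digits[:6].upper()
    return ":".join(prefix[i:i + 2] for i in (0, 2, 4))


# MAC addresses taken from the router log earlier in the thread.
for mac in ("2C:FD:A1:62:9C:E9", "60:45:cb:d0:a6:30"):
    print(mac, "->", oui(mac))
```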

Oh my gosh what a cool tool!!!

Something Intel apparently...

1C:1B:B5
Intel Corporate
Lot 8, Jalan Hi-Tech 2/3
Kulim Kedah 09000
MY

Another offender per the logs is

E0:B9:4D
SHENZHEN BILIAN ELECTRONIC CO.,LTD
NO.268, Fuqian Rd, Jutang community, Guanlan Town, Longhua New district
shenzhen guangdong 518000
CN

Smartphone or TV would be my guess...

We actually were having problems with our TV that seemed to resolve when we unplugged it and forced it to go over WiFi. Odd.

Second one may be a wifi extender....

Hmm, maybe I need to look at our other Asus routers that we were using with their mesh system. Thanks for the hints.

Edit: Narrowing things down. Two MAC addresses appear to be associated in some way with the PetSafe Smart Feeder (that's the Shenzhen device). It threw a load of errors even when I opened the app on my phone. I completely reset the feeder and reconnected it to WiFi, and haven't seen the same thing since. I also noticed that the router firmware hadn't been updated in a while, and it was acting a little funny... I manually updated to the latest stable release (which came out last week, incidentally), and errors within Hubitat no longer seem to affect the router logs at all. So, long story short, it might have just been a router problem. TL;DR: I think the "storm" concept is what was happening. Thank you for showing me where and how to troubleshoot!

I think I disagree on the use of the network by Hubitat. While it may not be a factor in the OP's issue, the hub logs all events to the local network. I say this because when I use Node-RED to capture events, they all come through the router. I believe the events are published whether something is listening or not, but I'm not positive.

The only way Node-RED would see events is if you have MakerAPI running and are subscribing to them through it. If you’re not running an external application (e.g. Node-RED) that is requesting services, there is no traffic; even if you have MakerAPI installed, it just sits there idle until you make a request, whether one-time or a subscription.