Hub LAN offline for up to 30min post reboot

So this just recently started happening, and I don't believe it's related to any recent update. After a reboot, the hub appears non-responsive (no ping replies, the web UI doesn't load, etc.). But at some random point, while I was monitoring via ping, it came back. At first, I thought the hub was stuck in the boot process. However, upon reviewing the logs, the hub seems to have rebooted fine (I haven't timed it yet), but the LAN was not connected. I can see log entries for other devices starting after the boot, but the network-based apps, like Ecobee, show "Network is Unreachable" in the logs.

Is there any way I can see why the LAN/NIC was not up shortly after boot? I can't seem to find any related logs.

Not that I'm aware of. What you're describing is unusual. Tagging @gopher.ny

Here's one theory...

Is it possible your hub rebooted and could not find a DHCP server, so it assigned itself an APIPA address, but you have DHCP retries turned on and the hub eventually got an IP address and came online? I don't know whether this would show up in the logs, but it is one scenario that might explain what you're seeing.

If this is the case there are a couple of things to check, but the first thing is whether your DHCP server was up and running and available before the hub started up.
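If you can catch what address the hub has while it's "offline" (say, from your switch or ARP table), a quick way to tell whether it fell back to APIPA is to check for the 169.254.0.0/16 link-local range. A minimal sketch in Python, run from any PC; the function name `classify_hub_address` is just something I made up for illustration, not anything from the hub:

```python
import ipaddress


def classify_hub_address(addr: str) -> str:
    """Roughly classify an IPv4 address seen for the hub.

    "apipa"   -> self-assigned link-local address (169.254.0.0/16),
                 which suggests no DHCP answer was received at boot.
    "private" -> a normal private LAN address (e.g. 192.168.x.x).
    "other"   -> anything else.
    """
    ip = ipaddress.ip_address(addr)
    # Check link-local first: Python also counts 169.254.0.0/16 as private.
    if ip.is_link_local:
        return "apipa"
    if ip.is_private:
        return "private"
    return "other"
```

For example, `classify_hub_address("169.254.10.20")` returns `"apipa"`, which would point toward the DHCP timing theory.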


This. It happens to me any time the power blips for everything at once, because the router takes longer to boot than the hub. Interestingly, local LAN control for LIFX seems to still work even in this state. I guess it's some combination of only needing L2/MAC-based protocols, not L3/IP-switching.

My C4, not having the DHCP retries feature, never comes back (it probably would if I left it for 24 hours to try a new lease, but I get impatient :upside_down_face:). It's on my backlog to write an app to monitor and gracefully reboot the older hubs.


I was going to check my switch later to see if it still shows connected during a reboot, and if so, try a packet capture.

Yeah, that would certainly yield some interesting data. I suppose you could also have a bad Ethernet port in the HE, though I'd think that would be pretty rare.

You might try doing a ping test from HE to your router and log failures. Who knows... maybe it's losing connection at other times and you just haven't noticed it.
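If it's easier to watch from the other direction, here's a rough sketch of a logger you could run on a PC on the same LAN: it pings the hub and prints a timestamp whenever it flips between UP and DOWN, so you get actual outage durations across a reboot. Assumptions to flag: the function names and the `192.168.1.50` address are made up for illustration, and the `ping -c 1 -W` flags are Linux-style, so adjust for Windows or macOS.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Send one ICMP echo via the system ping (Linux-style flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_outages(samples):
    """Turn time-ordered (timestamp, ok) samples into outage spans.

    Returns a list of (start, end) tuples, where start is the first failed
    sample of a run and end is the first successful sample after it
    (or None if the host was still down at the last sample).
    """
    outages = []
    start = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts  # first failure of a new outage
        elif ok and start is not None:
            outages.append((start, ts))  # host answered again
            start = None
    if start is not None:
        outages.append((start, None))  # still down at end of samples
    return outages


def monitor(host: str, interval_s: int = 5, ping=ping_once) -> None:
    """Loop forever, printing a line each time the host changes state."""
    was_up = None
    while True:
        up = ping(host)
        if up != was_up:
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} {host} {'UP' if up else 'DOWN'}")
            was_up = up
        time.sleep(interval_s)
```

Start it with `monitor("192.168.1.50")` (substituting your hub's address) before you hit reboot, and leave it running for a day or two to see whether the hub also drops off at other times.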

Please try Settings - Network setup - DHCP auto reconnect. The 30 or 60 second selection should be frequent enough.

So here's the result of my testing:

  1. I don't believe there is a link failure at any point. I've checked my switch and am not seeing any failures.
  2. When set to DHCP, on a couple of restarts today, the system came back up within minutes. However, checking my DHCP service logs, I'm definitely seeing DHCP traffic between the service and Hubitat. This happens pretty quickly after I click reboot, but I don't see ping replies for a couple of minutes.
  3. When set to a static IP, it took nearly 30 min to come online.

My guess: it could be a hardware issue in assigning the IP address to the NIC. My DHCP server offers up the proper IP address rather quickly. And on static, there's no reason it should take time to come online. I'd like to find out whether I have a hardware issue by having someone check the system logs.

Definitely worth checking for a HW issue but to me this sounds more like a network stack thing. There are other threads that describe unrelated challenges with static IPs - maybe the answer is just to use a DHCP reservation and be done with it?

Have you made any (recent) changes to the physical layer (changing or moving cables, switches, routers, switch ports, etc.)?
Have you tried using a different ethernet cable?

I've been using DHCP reservations for years. I only tried static as a troubleshooting step. Could be the network stack, but I do see DHCP packets going back and forth at boot.

Once the system is online, I'm good. I could just leave it, but I think its days may be numbered.

No network changes (but a major one will be done in the coming days: pfsense -> opnsense). I could test again after that. Once the system is online I no longer see network-related issues.