C-8 Pro - local DNS fails after DHCP lease renews

I'm having a weird DNS issue on my C-8 Pro (2.3.8.118) that I've been using for a bit less than 24 hours.

Early this morning, and again this afternoon, I had errors in the logs showing that apps were unable to connect to devices on my LAN. I also had automations fail where the rule contained an HTTP request to a local URL.

I use DNS names for all my local devices, as it makes things much easier if/when I move services between different machines. My router's DHCP server provides two LAN IPs as the DNS servers. Each of those IPs is a Pi4 running PiHole DNS. My home network uses the "home.lan" domain, so the local DNS entries are for whatever.home.lan. While I do run subnets for other things, the hub and DNS servers are all on the main LAN subnet.

The first time this happened today, my Powerwall Manager app was unable to connect to the Powerwall gateway at powerwall.home.lan. I used the hub's Network Test tool to try to ping the gateway and it failed with "ping: bad address 'powerwall.home.lan'". I could ping anything on the LAN by IP, but not by xxxx.home.lan DNS name.

I entered the two DNS server IPs in the DNS override on the hub in 'Settings -> Network Setup -> Override DNS settings' and then rebooted the hub. Things worked normally for a couple of hours before the hub again couldn't connect to local devices by DNS name.

The Network Test tool again showed that I could ping devices by IP but not by DNS name. I had no issues pinging those devices by name from other devices on my LAN.
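For anyone wanting to repeat the "does the name resolve from another LAN device" check, here is a minimal Python sketch. It only tests the resolver of the machine it runs on, not the hub's, so it can confirm the DNS servers are answering but cannot detect the hub-side failure. The `resolves` helper and its `resolver` parameter are my own names for illustration.

```python
import socket


def resolves(hostname, resolver=socket.getaddrinfo):
    """Return the sorted addresses for hostname, or [] if resolution fails.

    `resolver` defaults to the system resolver; it is injectable only so the
    function can be exercised without touching the network.
    """
    try:
        infos = resolver(hostname, None)
    except socket.gaierror:
        # Name lookup failed (the Python equivalent of ping's "bad address").
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling `resolves("powerwall.home.lan")` on a LAN machine should return a non-empty list while the DNS servers are healthy, and an empty list if the lookup fails.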

Just to be sure nothing weird was going on with my router, I rebooted the router first but that didn't resolve the issue. I then rebooted the hub and things started working again.

The hub is using DHCP for IP address assignment. Given that it initially worked with no DNS override setting, it seems to get the correct DNS servers when it boots, but later can't use them, regardless of whether the servers came from the DHCP server or from the override.

For the time being I'm going to create a rule that pings a LAN device by name every minute and either notifies me or just reboots the hub (in case it happens overnight) when the ping fails, but that's an ugly, disruptive workaround. Edit: it looks like the ping command in Rule Machine can only ping by IP, not name, so I'll have to find a way to trigger a notification based on the errors being logged.
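As a rough idea of what triggering on logged errors could look like, here is a small Python sketch that scans log lines for the "bad address" message quoted earlier. It assumes the log text has already been collected somehow; how to get logs off the hub is not covered here, and the messages an app actually logs may be worded differently from the Network Test output. The function name is hypothetical.

```python
import re

# Matches the Network Test error quoted in this thread, e.g.
# ping: bad address 'powerwall.home.lan'
BAD_ADDRESS = re.compile(r"bad address '([^']+)'")


def failed_hostnames(log_lines):
    """Return hostnames, in first-seen order, that appear in 'bad address' errors."""
    seen = []
    for line in log_lines:
        match = BAD_ADDRESS.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

A non-empty result would be the cue to send a notification or schedule a reboot.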

Happened again at about 2130 this evening. I just switched to static IP with the gateway and DNS set in that screen and rebooted. Will see how long it lasts this time.

@bravenel @bobbyD Any idea what could be going on? The only thing I can think is something is resetting the DNS to some default setting, but I'm not sure that makes any sense. Public DNS (ping Google.com) still works when the private addresses stop resolving.

Setting a reservation like you did is the best way.
It should remain fine. If you are setting a static IP, make sure your DNS entries are separated by commas. Even if the entries show preloaded in the DNS field, type them out again and click Done or Save. Now, as for resolving DNS on your local network: are you running a local DNS server or are you relying on mDNS?

I am running two redundant DNS servers. I don't think the PiHole servers use mDNS. The DNS entries are separated by a comma and a space, as per the example given in the UI.
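To rule out a formatting problem in the DNS field, the entry can be sanity-checked with a few lines of Python. This is only a sketch of a comma-separated format check; how the hub itself parses the field is unknown to me, and the addresses in the usage note are made up.

```python
import ipaddress


def parse_dns_list(text):
    """Split a comma-separated DNS server string and validate each entry.

    Whitespace around entries is ignored, so "a, b" and "a,b" are equivalent.
    Raises ValueError on an empty or malformed entry.
    """
    servers = []
    for part in text.split(","):
        entry = part.strip()
        if not entry:
            raise ValueError("empty entry in DNS list")
        # ip_address() raises ValueError for anything that is not a valid IP.
        servers.append(str(ipaddress.ip_address(entry)))
    return servers
```

For example, `parse_dns_list("192.168.1.2, 192.168.1.3")` returns both addresses, while a space-separated string with no comma raises `ValueError`.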

Again, the odd thing is that local name resolution works for a while, then it stops working. I have a lot of other services running on other devices that rely on local name resolution, and none of them are having issues, so I don't think the problem is with the DNS servers themselves. My C-7 ran the same apps and rules for a very long time with no DNS issues.

So far it hasn't failed with the static IP and DNS set, but I'll need more time to see if it is going to happen again. This timeline shows when the Powerwall Manager app was throwing warnings about not being able to reach the local device. The end of each yellow line is when I rebooted the hub. I upgraded to the C-8 Pro around 1700 hours on 2/9, so the span since last night's reboot is about as long as it has run so far without failing. If it makes it to tomorrow without any issue I'll try to set it back to DHCP and see what happens.

I have seen this same issue since upgrading. Temporarily works and then reverts to not resolving endpoints by hostname.

Mine ran ok for over 24 hours when set to static IP. I set it back to DHCP about 5 hours ago and am waiting to see if it happens again. If it does I'll go back to static IP indefinitely.

Failed again. Going back to static IP.

I upgraded around the left edge of this chart. The yellow lines indicate warnings thrown by my Powerwall Manager app when it couldn't resolve the local DNS name.

The gap annotated in red is when I was using static IP settings. You can see it lost local name resolution within a few hours of going back to DHCP.

I just had a :bulb: moment. My new-ish router, running OpenWRT, defaults to a 12 hour DHCP lease. The issue seemed to arise about 12 hours after a reboot.

I was thinking that something goes haywire when the initial lease expires. I don't think the issue is (exclusively) with the router, since my C-7 didn't have this issue and none of my other LAN devices have it.

I can override the lease time for a specific client, so I set the hub's lease on the router to 5 minutes, and set the hub back to DHCP. Just for good measure I set the DNS override option on the hub so it should use my local DNS no matter what the DHCP server says.

Bingo! This issue is now reproducible on demand:

Immediately after the reboot:
image

5 minutes (plus a few seconds) after the reboot:

image
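The timing argument can be written down as a quick check: does the first failure land about one lease period after the reboot? This Python sketch is only an illustration of that comparison; the function name, the one-minute tolerance, and any timestamps fed to it are assumptions, not values taken from the hub.

```python
from datetime import datetime, timedelta


def failure_matches_lease(reboot, first_failure, lease, tolerance=timedelta(minutes=1)):
    """True if the first failure occurred within `tolerance` of one lease period after reboot."""
    elapsed = first_failure - reboot
    return abs(elapsed - lease) <= tolerance
```

With a 5-minute lease, a failure seen 5 minutes and a few seconds after reboot matches; the same failure does not match a 12-hour lease, which is what makes the short lease a useful test.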

@bravenel or @bobbyD - Let me know if you need me to PM my hub's ID so you can look at logs. Since at least one other user has noticed this, I'm hoping you can reproduce it as well.

Edit: edited the thread title to describe the issue more clearly.

I will have to add an endpoint to dump some debugging information, something that runs the ip addr show command and the like.

Could I suggest that you go back to static IP? While we will provide the endpoint @gopher.ny said above, there are many things on our plate. If your issue can be solved directly with static IP, as you said above, that would be best. We can't promise that this is going to be top of our priorities.

Absolutely. I've already switched back to static IP and can keep it that way indefinitely. Hopefully there are enough keywords between the thread title and messages that anyone else having the same issue can find the temporary fix. As long as this is somewhere on your list to look at later, I'm good.
