Networking bafflement

Volume 3, Issue 14; 27 Apr 2019

When I said I was having a weird networking issue, I didn’t know the half of it.

This post has been replaced.

A couple of weeks ago, more or less, I mentioned a weird networking issue: if my laptop was plugged into the network, everything was fine. If I disconnected the ethernet cable, everything stopped. And I don’t just mean for my laptop, which has a perfectly functional wireless connection to the router; I mean every device on the network was unable to reach the internet. If I logged into the router, I couldn’t even ping numeric IP addresses (but I could, from the router, if my laptop was wired).

Here’s what my network looks like:

[Figure: Network configuration]

Random chats with several folks led me to the conclusion that my router had become flaky. It was a several-years-old Netgear R7000 running DD-WRT. Aside from the fact that, in my experience, residential networking gear has a half-life of about five years, I had no real evidence to support my conclusion. But I didn’t have any other guesses, either.

Everything I know about networking I learned via web searching and experimentation. I am not well equipped to determine what’s really going on here. And I’m busy; this is not a fun challenge right now.

I fretted over the frustratingly opaque and jargon-riddled pages at the DD-WRT and OpenWrt sites for a couple of days and ordered myself a Netgear R7800, onto which I installed OpenWrt (because James ☺).

And…it made no difference.

Except I can ping numeric IP addresses now. (I haven’t tried to, and don’t have the time or energy to try to, reproduce the configuration where I couldn’t even ping IP addresses.) However, I can’t ping my DNS server. (An aside: I really cannot recommend running Pi-hole strongly enough. Whenever I have it turned off, I see ads in all kinds of places I don’t expect. The beauty of using Pi-hole is that it blocks ads for every device on the network. I had no idea, for example, that the Roku channels page has advertisements!) No DNS certainly explains why the problem affects every device on my network!
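The distinction above, between reaching a numeric address and resolving a name, can be checked separately. What follows is a minimal sketch, not what was actually run here: it uses a TCP connection as a stand-in for ping (raw ICMP usually needs elevated privileges), and the address 1.1.1.1, port 53, and the name example.com are illustrative assumptions.

```python
import socket


def can_reach_ip(ip: str, port: int = 53, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to a numeric address succeeds.

    No name lookup is involved, so this works even when DNS is down.
    """
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def can_resolve(name: str) -> bool:
    """Return True if the system resolver can turn a name into an address."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def diagnose(ip_ok: bool, dns_ok: bool) -> str:
    """Summarize the two probe results as a rough diagnosis."""
    if ip_ok and dns_ok:
        return "ok"
    if ip_ok:
        return "dns failure"
    if dns_ok:
        return "names resolve but the IP probe failed"
    return "no connectivity"


if __name__ == "__main__":
    # Hypothetical targets; substitute whatever makes sense on your network.
    print(diagnose(can_reach_ip("1.1.1.1"), can_resolve("example.com")))
```

A result of "dns failure" would match the situation described here: numeric addresses work, but nothing that depends on the DNS server does.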

I run Pi-hole on a Raspberry Pi and all my DNS goes through that. I turned that off and, sure enough, the problem went away. I’m not running my network without Pi-hole, so I did a little dance where I unplugged my laptop, waited for it to fail, plugged my laptop back in, and looked at dmesg on the Pi. And there it was:

The CPU reports a problem with the eth0 interface:

WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:461 dev_watchdog+0x294/0x298
NETDEV WATCHDOG: eth0 (smsc95xx): transmit queue 0 timed out
Modules linked in: sha256_generic cfg80211 rfkill snd_bcm2835(C) snd_pcm …
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G         C        4.19.36-v7+ #1213
Hardware name: BCM2835
[<80111eac>] (unwind_backtrace) from [<8010d430>] (show_stack+0x20/0x24)
[<8010d430>] (show_stack) from [<8080e560>] (dump_stack+0xd4/0x118)
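For anyone repeating this dance, spotting the watchdog line in a long dmesg dump can be automated. The following is a minimal sketch that assumes the message format shown in the log above; the function name and the regular expression are illustrative, not part of any kernel tooling.

```python
import re
import sys

# Matches lines such as:
#   NETDEV WATCHDOG: eth0 (smsc95xx): transmit queue 0 timed out
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_watchdog_timeouts(log_text: str) -> list:
    """Return (interface, driver, queue) tuples for each watchdog timeout."""
    events = []
    for line in log_text.splitlines():
        match = WATCHDOG_RE.search(line)
        if match:
            events.append(
                (match["iface"], match["driver"], int(match["queue"]))
            )
    return events


if __name__ == "__main__":
    # Read kernel log text from standard input and report any timeouts.
    for iface, driver, queue in find_watchdog_timeouts(sys.stdin.read()):
        print(f"{iface} ({driver}): transmit queue {queue} timed out")
```

Saved under a hypothetical name such as watchdog.py, it could be fed the kernel log with `dmesg | python3 watchdog.py`.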

I have absolutely no explanation for why eth0 on the Pi should crap out if and only if my laptop is disconnected. And it is my laptop; I put another laptop on the wired network and it still failed when I disconnected mine. I can’t concoct any explanation, no matter how far-fetched, for this behavior.

I’ve been running this Pi-hole configuration for at least a couple of years. It has been flawless. And “I haven’t changed anything.” A bit of web searching led me to the suggestion that upgrading the Pi firmware might help. It didn’t. Maybe, I concluded, the Pi has just gone flaky. I’ve got another one hooked up to the TV, so I swapped them.

And…exactly the same thing happened.

I was briefly under the misapprehension that the problem went away if I plugged the Pi directly into the router, instead of into the switch. It didn’t.

I put the original Pi back and simply connected it wirelessly. Relying on a wireless connection for DNS seems…suboptimal, but I’ve already burned more than all of the time and energy I have for this. Presumably this will fix the problem; time will tell.

It has occurred to me since that I may have run the ‘apt-get’ upgrade dance on the first Pi, and the process of installing Pi-hole on the second one may also have occasioned the upgrade dance. Maybe this is just a bug in the current Pi distribution. I dunno. I’m giving up for now (assuming the switch to a wireless connection fixes the problem!).

In theory, I can run Pi-hole on the NAS. I’ve put that on my todo list. Near the top of page 3,431 so not something I’m likely to get to in the near term.