The mystery of the home DNS blackhole
So for a while I’ve been experiencing a mysterious issue with the DNS that runs casa del don. Specifically, Android devices on 7.1 or later would work, and then not work, when resolving local names, e.g. the homeassistant, plex, etc. They could ping the IP, but got no DNS resolution. If I ran dig or ‘host’ on them, it worked. WTF?
Well, tool #1: tcpdump. We got that cranked up on the main router, which also runs the DNS. And, mysteriously, the device is sending its queries to a link-local IPv6 address. Not illegal, just odd:
IP6 fe80::d233:a842:6f0:3dde.54331 > fe80::f0b4:29ff:fed9:9e98.53: 62928+ A? ha.donbowman.ca
Hmm. Now, I should have stopped and looked here longer, but didn’t. The answer lies not in the link-local, nor the IPv6. Instead, it lies in *which* link-local address it used. It’s actually one of the other access points in the house!
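Since the giveaway is only the destination address, a small script can do the squinting for you. Below is a minimal sketch, not a polished tool: it assumes tcpdump was run with -n (so addresses are numeric) and produces lines shaped like the one above. The function name `unexpected_dns_queries` and the "expected" resolver address `fe80::1` are hypothetical placeholders; substitute the link-local address of the router that should be answering.

```python
import ipaddress
import re

# Matches tcpdump -n query lines such as:
#   IP6 <src>.<sport> > <dst>.<dport>: 62928+ A? ha.donbowman.ca
QUERY_RE = re.compile(
    r"IP6?\s+(?P<src>\S+)\.(?P<sport>\d+)\s+>\s+(?P<dst>\S+)\.(?P<dport>\d+):"
    r"\s+\d+\S*\s+(?P<qtype>[A-Z]+)\?\s+(?P<name>\S+)"
)


def unexpected_dns_queries(lines, expected_servers):
    """Return (src, dst, name) for DNS queries sent to a server not in expected_servers."""
    expected = {ipaddress.ip_address(s) for s in expected_servers}
    hits = []
    for line in lines:
        m = QUERY_RE.search(line)
        if not m or m.group("dport") != "53":
            continue  # not a DNS query line
        try:
            dst = ipaddress.ip_address(m.group("dst"))
        except ValueError:
            continue  # hostname instead of an address (tcpdump run without -n)
        if dst not in expected:
            hits.append((m.group("src"), str(dst), m.group("name")))
    return hits


if __name__ == "__main__":
    captured = [
        "IP6 fe80::d233:a842:6f0:3dde.54331 > fe80::f0b4:29ff:fed9:9e98.53: "
        "62928+ A? ha.donbowman.ca",
    ]
    # fe80::1 is a hypothetical stand-in for the real resolver's address.
    for src, dst, name in unexpected_dns_queries(captured, ["fe80::1"]):
        print(f"{src} asked {dst} for {name}")
```

Comparing parsed addresses rather than raw strings means differently abbreviated forms of the same IPv6 address still match.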
So, how did this happen? And what is happening?
Well, I swapped out the Asus RT-AC66U that ran the bedroom for a Xiaomi Mi Mini. And of course, put LEDE on it. (The swap was done since the Asus is kind of a pain to use: its proprietary 5 GHz radio isn’t supported on FOSS, and without FOSS the VLANs don’t work... and I wanted the upstairs IoT devices to enjoy good radio but on a non-default VLAN.)
So after the swap, all worked. But the Mi Mini was running dnsmasq (despite not being a DHCP server). And the Android devices were finding it, and using it. But it thinks it’s authoritative for my domain. So the Android devices would do a DNS lookup, on IPv6, and get a NODATA-V6 (NXDOMAIN), and give up. They were asking the wrong server!
And me, not being observant enough, spent a lot more time investigating this than I should have. It’s hard to spot the link-local address difference.
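One way to confirm a suspicion like this is to put the same A query to each candidate server directly and compare what comes back. The following is a rough sketch using only the Python standard library, assuming plain UDP on port 53; the helper names (`build_query`, `summarize_response`, `ask`) are made up for illustration. Strictly speaking NXDOMAIN (response code 3) and NODATA (response code 0 with an empty answer section) are different DNS outcomes, so the sketch reports them separately.

```python
import socket
import struct


def build_query(name, qtype=1, txid=0x1234):
    """Build a minimal DNS query packet (qtype 1 = A, class IN, recursion desired)."""
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def summarize_response(data):
    """Classify a DNS response from its 12-byte header: returns (status, answer_count)."""
    _txid, flags, _qd, an, _ns, _ar = struct.unpack("!HHHHHH", data[:12])
    rcode = flags & 0x000F
    if rcode == 3:
        status = "NXDOMAIN"
    elif rcode == 0 and an == 0:
        status = "NODATA"
    elif rcode == 0:
        status = "ANSWER"
    else:
        status = f"RCODE {rcode}"
    return status, an


def ask(server, name, port=53, timeout=2.0):
    """Send one A query for name to server over UDP and summarize the reply."""
    # getaddrinfo handles IPv4, IPv6, and scoped link-local forms like fe80::1%iface
    family, socktype, proto, _, sockaddr = socket.getaddrinfo(
        server, port, type=socket.SOCK_DGRAM
    )[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), sockaddr)
        data, _ = sock.recvfrom(512)
    return summarize_response(data)
```

For example, `ask("fe80::f0b4:29ff:fed9:9e98%br-lan", "ha.donbowman.ca")` would query the address seen in the capture, where `br-lan` is a hypothetical interface name; a link-local destination needs the interface scope to be usable. If the real router gives an answer and the access point gives a negative one, that is the mismatch.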