When good recursives go bad: why does 1.1.1.1 not give the same answer as 8.8.8.8?
Anyone got a suggestion for this? Why would Cloudflare DNS be unable to resolve the SOA record of a given domain when Google DNS can? 9.9.9.9 also resolves it without trouble. It's not a one-off either: 1.1.1.1 repeatedly fails to answer for this domain, but answers fine for other domains. Hmmm. Is this a disconnect between Cloudflare and the .ca TLD servers? A caching issue in Cloudflare? If we run "$ dig @d.root-servers.net -t SOA 100years.ca" and then check each of the .ca servers it refers us to (c.ca-servers.ca etc.), they all work from my house... Hmm.
$ dig @1.1.1.1 -t soa 100years.ca
...
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 63615
...
$ dig @8.8.8.8 -t soa 100years.ca
...
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 196
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;100years.ca. IN SOA
;; ANSWER SECTION:
100years.ca. 10799 IN SOA ns1.fastpark.net. hostmaster.100years.ca. 1545692000 28800 7200 604800 86400
C:\TEMP>nslookup 100years.ca
Server: google-public-dns-a.google.com
Address: 8.8.8.8
Non-authoritative answer:
Name: 100years.ca
Address: 185.53.178.7
C:\TEMP>nslookup 100years.ca 1.1.1.1
Server: one.one.one.one
Address: 1.1.1.1
DNS request timed out.
timeout was 2 seconds.
DNS request timed out.
timeout was 2 seconds.
*** Request to one.one.one.one timed-out
Oddly enough, it looks like it came alive while I was trying to test it out. I tried to check other regions to see if I accidentally populated a cache entry, but the other regions appear to be working as well.
I see a couple of possibilities:
1. Someone just managed to find and fix the error
2. ns1/ns2.fastpark.net have some sort of IDS/IPS that may have blocked Cloudflare's DNS, or some other configuration that was specific to Cloudflare (like a zones setup). It looks like fastpark is hosted on EC2, though.
3. There is a subtle bug in the Cloudflare DNS resolver that shows up when no glue record is returned and the record isn't cached. When querying 100years.ca, the .ca nameserver returns ns1/ns2.fastpark.net. as the authority without any additional section (glue records), because .net names are outside the .ca zone. That should cause the Cloudflare resolver to look up the addresses of ns1/ns2 via .net, but it may not have done so. I chased a similar problem in GSMA DNS several years ago, but GSMA DNS is very fragile, and I don't believe this scenario is very likely here.
4. ns1/ns2.fastpark are somewhat slow to respond to Cloudflare. I got an interesting result when I purged the Cloudflare cache: it took 3 seconds to refresh. It's possible that this quite often times out for some reason, and I just got lucky that it's currently resolving under the timeout.
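To make possibility 3 a bit more concrete, here is a minimal Python sketch of the check a resolver effectively has to make: is each nameserver name inside the zone that is handing out the delegation? If not, the parent cannot usefully supply glue, and the resolver has to resolve the nameserver's address separately. The function names (`is_in_bailiwick`, `extra_lookups_needed`) are made up for illustration, and this is a simplification of what real resolvers do, not Cloudflare's actual logic.

```python
def normalize(name: str) -> str:
    """Lower-case a DNS name and strip the trailing root dot."""
    return name.rstrip(".").lower()


def is_in_bailiwick(ns_name: str, zone: str) -> bool:
    """Return True if ns_name is at or below zone (compared label by label)."""
    ns_labels = normalize(ns_name).split(".")
    zone_norm = normalize(zone)
    zone_labels = zone_norm.split(".") if zone_norm else []  # "" is the root zone
    if len(zone_labels) > len(ns_labels):
        return False
    # Compare the rightmost labels of the nameserver name with the zone's labels.
    return ns_labels[len(ns_labels) - len(zone_labels):] == zone_labels


def extra_lookups_needed(delegating_zone: str, ns_names: list) -> list:
    """Nameserver names the resolver must resolve itself (no glue expected)."""
    return [ns for ns in ns_names if not is_in_bailiwick(ns, delegating_zone)]
```

For the delegation in this thread, `extra_lookups_needed("ca.", ["ns1.fastpark.net.", "ns2.fastpark.net."])` returns both names, which is the glueless case described in item 3. A hypothetical `ns1.example.ca.` would not be in the list.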
Anyway, I would suspect Cloudflare/fastpark connectivity or configuration ahead of a problem between Cloudflare and .ca, but since it now seems to be working, that's somewhat difficult to prove.
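If it starts failing again, one way to gather evidence is to probe each resolver repeatedly and log both the response code and how long the answer took, which would help separate "always SERVFAIL" from "sometimes slower than the timeout". Below is a rough Python sketch using only the standard library. It hand-builds a plain UDP SOA query without EDNS, so it is not identical to what dig sends. The function names, the 5-second timeout, and the three attempts per resolver are my own arbitrary choices.

```python
import socket
import struct
import time

QTYPE_SOA = 6
QCLASS_IN = 1
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, qtype=QTYPE_SOA, query_id=0x1234):
    """Build a minimal DNS query packet with the RD (recursion desired) bit set."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, QCLASS_IN)


def parse_rcode(response):
    """Extract the response code from the low 4 bits of the header flags."""
    if len(response) < 12:
        raise ValueError("response shorter than a DNS header")
    flags = struct.unpack("!H", response[2:4])[0]
    rcode = flags & 0x000F
    return RCODES.get(rcode, "RCODE%d" % rcode)


def probe(resolver_ip, name, timeout=5.0):
    """Send one query; return (status, seconds elapsed)."""
    query = build_query(name)
    start = time.monotonic()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (resolver_ip, 53))
            response, _ = sock.recvfrom(4096)
            # A stray packet with a different ID is not an answer to our query.
            if response[:2] != query[:2]:
                status = "ID_MISMATCH"
            else:
                status = parse_rcode(response)
        except socket.timeout:
            status = "TIMEOUT"
    return status, time.monotonic() - start


def compare_resolvers(name="100years.ca", attempts=3):
    """Print status and latency for each public resolver mentioned in the thread."""
    for ip in ("1.1.1.1", "8.8.8.8", "9.9.9.9"):
        for _ in range(attempts):
            status, elapsed = probe(ip, name)
            print("%-8s %-12s %.3fs" % (ip, status, elapsed))
```

Calling `compare_resolvers()` from a scheduled job and keeping the output would show whether 1.1.1.1 fails consistently or only when the upstream lookup is slow. It cannot tell you why Cloudflare's upstream query failed, only what the client saw.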