On 08/07/2013 01:53 AM, Phil Mayers wrote:
> On 08/07/2013 12:09 AM, Grant Keller wrote:
>> Hello,
>>
>> We have 7 recursive DNS servers running Bind 9.9.2, and we are seeing
>> some strange behavior validating DNSSEC. We have seen this happen a few
>> times, and in the past the problem has gone away when the server is
>> rebooted, so my first guess is that some record is stuck in the cache.
>
> "Rebooted" is a bit extreme; did you actually reboot the OS, or do you
> mean "restart bind"? When the problem occurs, have you tried "rndc
> flush" to see if that corrects it?
>
> Are you using any forwarders, or might your upstream be doing
> transparent DNS caching? Unlikely, but not unheard of.

I should have been clearer: the server was rebooted for a kernel update.
Given that, I think restarting bind would fix the problem; I just don't
want to do that unless I have to.

>> # dig a zygo.com @127.0.0.1 +nocomments
>
> +nocomments has hidden the rcode (NODATA, SERVFAIL, etc.). So, not
> entirely helpful here.
>
> http://dnsviz.net/d/zygo.com/dnssec/
>
> ...suggests there might be an oddity with the TTL on the TXT records
> at zone apex, but not the A record. Otherwise zone looks ok.
>
> You could try:
>
> rndc dumpdb -cache
>

I ran a cache dump on both a working server and a non-working one, but I
am not sure what to make of the results. On the server that is not
validating, the section of the cache looks like this:

ftp://ftp.sonic.net/pub/users/gkeller/cache_insecure.txt

The "pending answer" part is strange; I don't recall seeing that before.
The "good" server has these all marked secure.

>> ; <<>> DiG 9.7.0-P2-RedHat-9.7.0-17.P2.el5_9.2 <<>> a zygo.com
>> @127.0.0.1 +nocomments
>> ;; global options: +cmd
>> ;zygo.com. IN A
>> ;; Query time: 162 msec
>> ;; SERVER: 127.0.0.1#53(127.0.0.1)
>> ;; WHEN: Tue Aug 6 16:06:10 2013
>> ;; MSG SIZE rcvd: 26
>>
>> # dig rrsig zygo.com @127.0.0.1 +nocomments
>>
>
> Hmm. This *is* odd. We're on bind 9.9.3 and it seems "dig domain.com
> rrsig" always returns TTL=0.
>
> I wonder if this is new? I don't recall seeing it before.
>
> In any event, as Mark has suggested, you don't want to dig the RRSIG
> yourself. Rather, use:
>
> dig +dnssec zygo.com a
>
> ...and if you get a SERVFAIL:
>
> dig +dnssec +cd zygo.com a

"dig +dnssec +cd zygo.com a" resolved the domain. I have started to get
other reports of domains with the same problem. The same nameservers are
having validation issues with these, and all of the domains use
pdns01.domaincontrol.com and pdns02.domaincontrol.com as authoritative
name servers. I guess this points to a problem somewhere in the trust
chain, but I can't figure out where.
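
For checking the other reported domains the same way, here is a rough
sketch of the "+dnssec, then +cd on SERVFAIL" comparison suggested above.
The function names (classify_rcodes, get_rcode, check_name) and the output
labels are made up for illustration; the only dig options used are the
ones already mentioned in this thread. Treat it as a starting point, not a
tested tool.

```shell
#!/bin/sh
# Hypothetical helper: interpret a pair of rcodes for one name.
#   $1 = rcode from "dig +dnssec NAME a"      (validation enabled)
#   $2 = rcode from "dig +dnssec +cd NAME a"  (checking disabled)
classify_rcodes() {
    case "$1:$2" in
        SERVFAIL:NOERROR|SERVFAIL:NXDOMAIN)
            # Fails only when the resolver validates: likely a DNSSEC problem.
            echo "validation-failure" ;;
        SERVFAIL:*)
            # Fails even with +cd: not (only) a validation problem.
            echo "resolution-failure" ;;
        NOERROR:*|NXDOMAIN:*)
            echo "ok" ;;
        *)
            # Timeout, empty output, or an rcode not handled here.
            echo "unknown" ;;
    esac
}

# Pull the rcode out of dig's header line ("status: NOERROR,").
#   $1 = server, $2 = name, remaining arguments = extra dig options
get_rcode() {
    server=$1; name=$2; shift 2
    dig "@$server" "$name" a +dnssec "$@" 2>/dev/null |
        sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Query one name both ways against one resolver and print a verdict.
#   $1 = server, $2 = name
check_name() {
    with_validation=$(get_rcode "$1" "$2")
    with_cd=$(get_rcode "$1" "$2" +cd)
    echo "$2: $(classify_rcodes "$with_validation" "$with_cd")"
}
```

Running something like "check_name 127.0.0.1 zygo.com" on each resolver
should show which servers fail only with validation enabled.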

# dig a zygo.com +sigchase +trusted-key=root.keys +multiline +qr

; <<>> DiG 9.7.0-P2-RedHat-9.7.0-17.P2.el5_9.2 <<>> a zygo.com +sigchase +trusted-key=root.keys +multiline +qr
;; global options: +cmd
;; Sending:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21316
;; flags: rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
;; QUESTION SECTION:
;zygo.com. IN A

;; NO ANSWERS: no more
We want to prove the non-existence of a type of rdata 1 or of the zone:
;; nothing in authority section : impossible to validate the non-existence : FAILED

;; Impossible to verify the Non-existence, the NSEC RRset can't be validated: FAILED

If I add +topdown then it succeeds.

--
Grant Keller
Sonic.net System Operations
_______________________________________________
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users