Hi, fellow BIND users.

The other day I was attempting to diagnose a problem on a recursive resolving
name server.  I had just enabled DNSSEC validation, and certain dig queries
(such as "www.isc.org", "www.dnssec-failed.org") were failing.  Even queries to
non-signed domains such as my own personal domain (which also happens to be in
.org) were failing.

I was testing it with this command line:

  dig +dnssec +bufsize=XXXX www.isc.org. a

Where XXXX is our locally configured edns-udp-size value.  The lookup took a
long time and eventually failed with a timeout error.  I tried to see where it
was failing with:

  dig +dnssec +bufsize=XXXX @127.0.0.1 www.isc.org. A +trace

I got this output before it hung:

=========================
 
; <<>> DiG 9.8.1-P1 <<>> +dnssec +bufsize=XXXX @127.0.0.1 www.isc.org. A +trace
; (1 server found)
;; global options: +cmd
.                       506674  IN      NS      a.root-servers.net.
.                       506674  IN      NS      e.root-servers.net.
.                       506674  IN      NS      l.root-servers.net.
.                       506674  IN      NS      m.root-servers.net.
.                       506674  IN      NS      d.root-servers.net.
.                       506674  IN      NS      g.root-servers.net.
.                       506674  IN      NS      i.root-servers.net.
.                       506674  IN      NS      f.root-servers.net.
.                       506674  IN      NS      h.root-servers.net.
.                       506674  IN      NS      c.root-servers.net.
.                       506674  IN      NS      k.root-servers.net.
.                       506674  IN      NS      b.root-servers.net.
.                       506674  IN      NS      j.root-servers.net.
.                       506674  IN      RRSIG   NS 8 0 518400 20120313000000 
20120305230000 51201 . kBn5abbR2172kIhOfAdf38Mi4IpqkclowMxD2BKh2hg3udwGeJfK3YOA 
I1Pz9lcb/NzFzh+ndVXZERaofryyoeE15ZD0HQxMqLai7HV6nVKQyiPZ 
vGXA3CsIua9g8dnnN4RNbYrPnM7i6f/hBgKph8/AcFHXAQfRFZIxiJL1 O50=
;; Received 397 bytes from 127.0.0.1#53(127.0.0.1) in 817 ms
=========================

I ended up spending an hour or two trying to figure out what was causing it to 
hang, and in the end, this query was hanging:

  dig +dnssec +bufsize=XXXX @a0.org.afilias-nst.info. org. DNSKEY

It turns out that the answer to that query is larger than our "bufsize", so the 
packet came back truncated, and BIND was re-trying over TCP, but our ACLs 
weren't set up right to allow that.
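
For anyone who wants to check the same thing without dig, here is a minimal
sketch in Python (standard library only) of what that failing step looks like
on the wire: a query carrying an EDNS0 OPT record with the advertised buffer
size and the DO bit, a check of the TC (truncated) flag in the reply, and a
retry over TCP.  The function names (build_query, is_truncated, probe) are my
own for illustration and are not part of BIND or dig; the sketch does no
validation and only handles IPv4 servers.

```python
import socket
import struct


def build_query(name, qtype, bufsize, dnssec_ok=True, query_id=0x1234):
    """Build a DNS query with an EDNS0 OPT record advertising `bufsize`."""
    # Header: ID, flags (RD clear, like an iterative query), QDCOUNT=1,
    # ANCOUNT=0, NSCOUNT=0, ARCOUNT=1 (the OPT pseudo-record).
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 1)
    labels = [label for label in name.rstrip(".").split(".") if label]
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels)
    question = qname + b"\x00" + struct.pack("!HH", qtype, 1)  # class IN
    # OPT: root owner name, type 41, CLASS = UDP payload size,
    # TTL = ext-rcode/version/flags (DO bit is 0x8000), RDLENGTH = 0.
    opt_ttl = 0x8000 if dnssec_ok else 0
    opt = b"\x00" + struct.pack("!HHIH", 41, bufsize, opt_ttl, 0)
    return header + question + opt


def is_truncated(response):
    """Return True if the TC bit is set in the response header."""
    flags = struct.unpack("!H", response[2:4])[0]
    return bool(flags & 0x0200)


def recv_exact(sock, count):
    """Read exactly `count` bytes from a TCP socket."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed early")
        data += chunk
    return data


def probe(server, name, qtype, bufsize, timeout=5.0):
    """Query over UDP; on truncation, say so and retry over TCP."""
    query = build_query(name, qtype, bufsize)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        udp.settimeout(timeout)
        udp.sendto(query, (server, 53))
        response, _ = udp.recvfrom(65535)
    print(f"UDP: {len(response)} bytes, truncated={is_truncated(response)}")
    if not is_truncated(response):
        return response
    print("TC bit set, retrying over TCP ...")
    # If TCP port 53 is blocked, this is where the hang or timeout shows up.
    with socket.create_connection((server, 53), timeout=timeout) as tcp:
        tcp.sendall(struct.pack("!H", len(query)) + query)
        length = struct.unpack("!H", recv_exact(tcp, 2))[0]
        return recv_exact(tcp, length)
```

Calling probe() with the address of one of the org servers, the name "org.",
type 48 (DNSKEY) and the local edns-udp-size value would exercise the same
path as the dig command above; a socket timeout on the TCP step points at
filtering of TCP port 53.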

My request is this:

Please add something to "dig" that replicates the behavior of BIND as closely
as possible with regard to the many queries it issues as part of a
DNSSEC-validating resolution.

I ran tcpdump on an unloaded server and captured all DNS query traffic 
immediately after running "rndc flush", and the queries it asked, in order, 
were:

          Remote server                    Domain    Type
======================= ========================= =======
     M.ROOT-SERVERS.NET    www.dnssec-failed.org.       A
     M.ROOT-SERVERS.NET                         .      NS
 d0.org.afilias-nst.org    www.dnssec-failed.org.       A
     dns105.comcast.net    www.dnssec-failed.org.       A
     f.root-servers.net                         .  DNSKEY
     dns104.comcast.net        dnssec-failed.org.  DNSKEY
a2.org.afilias-nst.info        dnssec-failed.org.      DS
 b0.org.afilias-nst.org                      org.  DNSKEY
     k.root-servers.net                      org.      DS
     dns101.comcast.net        dnssec-failed.org.  DNSKEY
     dns105.comcast.net        dnssec-failed.org.  DNSKEY
     dns103.comcast.net        dnssec-failed.org.  DNSKEY
     dns102.comcast.net        dnssec-failed.org.  DNSKEY
 c0.org.afilias-nst.org       dns104.comcast.org.    AAAA
     dns101.comcast.net       dns104.comcast.org.    AAAA

(The edns-udp-size for this server is 4096.) I realize that some of this
traffic might be unusual (such as the ". NS" query sent to
M.ROOT-SERVERS.NET), but the rest of it is normal DNSSEC resolution.
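
Until something like this exists, the capture can at least be replayed by
hand.  Here is a rough sketch in Python that turns the first nine rows of the
table into dig command lines.  The helper name dig_command is made up for
illustration, and the choice of +norecurse is my assumption about how to mimic
the non-recursive queries a resolver sends to authoritative servers; it does
not reproduce BIND's server selection or retry logic.

```python
# First nine rows of the captured table: (remote server, name, type).
CAPTURED = [
    ("M.ROOT-SERVERS.NET", "www.dnssec-failed.org.", "A"),
    ("M.ROOT-SERVERS.NET", ".", "NS"),
    ("d0.org.afilias-nst.org", "www.dnssec-failed.org.", "A"),
    ("dns105.comcast.net", "www.dnssec-failed.org.", "A"),
    ("f.root-servers.net", ".", "DNSKEY"),
    ("dns104.comcast.net", "dnssec-failed.org.", "DNSKEY"),
    ("a2.org.afilias-nst.info", "dnssec-failed.org.", "DS"),
    ("b0.org.afilias-nst.org", "org.", "DNSKEY"),
    ("k.root-servers.net", "org.", "DS"),
]


def dig_command(server, name, rtype, bufsize=4096):
    """Format one captured query as a dig command line."""
    return f"dig +dnssec +norecurse +bufsize={bufsize} @{server} {name} {rtype}"


# Print the commands in capture order so they can be run one at a time.
for server, name, rtype in CAPTURED:
    print(dig_command(server, name, rtype))
```

Running the printed commands one by one, with bufsize set to the local
edns-udp-size, shows which step stalls or comes back with the "tc" flag.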

It would be *extremely helpful* if dig printed out the queries it was doing as 
it was doing them, so I could have seen that it was re-trying a truncated 
response, and hanging on TCP.

The capture above doesn't even show the fallback to TCP that might happen if
the edns-udp-size were lower, as it is at my other location.

I understand I'm asking dig to do what BIND normally does, but since they're 
both packaged together, it seems like a reasonable request.

Does anyone else see a need for a tool like this?