Thanks for the information, Cathy. I've always run the Red Hat provided packages in the past; this is the first time I've tried running the newest release directly. Mostly I'm just feeling extra cautious, since this is something I've never done before and admittedly I don't know as much about DNS as I should, so I really appreciate you taking the time to break down what is happening.
Based on your explanation it sounds like this isn't something I'll ever run into other than this one special case, so I'll stop worrying about it. Thank you!

-Nick

-----Original Message-----
From: bind-users [mailto:bind-users-boun...@lists.isc.org] On Behalf Of Cathy Almond
Sent: Tuesday, February 27, 2018 4:29 AM
To: bind-users@lists.isc.org
Subject: Re: Issue running "dig txt rs.dns-oarc.net" on 9.12

On 22/02/2018 16:44, NNEX Support wrote:
> I'm sorry to keep replying to myself but I believe I've found the line of
> code that is causing this issue. Looking at validator.c, in the
> check_deadlock function, 9.12.0rc1 says:
>
> ...
>
> if (parent->event != NULL &&
>     parent->event->type == type &&
>     dns_name_equal(parent->event->name, name) &&
>
> ...
>
> 9.12.0rc3 and above says:
>
> ...
>
> if (parent->event != NULL &&
>     (parent->event->type == type ||
>      parent->event->type == dns_rdatatype_cname) &&
>     dns_name_equal(parent->event->name, name) &&
>
> ...
>
> By removing "parent->event->type == dns_rdatatype_cname)" (and adjusting the
> rest of the if statement appropriately) the query "dig ns rs.dns-oarc.net"
> works.
>
> I see this commit related to this line of code:
> https://gitlab.isc.org/isc-projects/bind9/commit/2b51d5874c49ac823890b88824290fbf1c18f2cc
>
> I'm sure this line of code is important, otherwise it wouldn't be there and I
> don't know enough to be removing random bits of code, so of course I'd never
> run this in production. Still I want to understand why this is happening and
> if it’s a bug or me not understanding DNS properly.

Good sleuthing - though apart from understanding why the query now fails, I don't think there's any code defect that needs to be addressed.

This line of code belongs with these changes between RC1 and RC3. They are kinda important (note the CVE reference):

4859. [bug] A loop was possible when attempting to validate unsigned CNAME responses from secure zones; this caused a delay in returning SERVFAIL and also increased the chances of encountering CVE-2017-3145. [RT #46839]

4858. [security] Addresses could be referenced after being freed in resolver.c, causing an assertion failure. (CVE-2017-3145) [RT #46839]

The debug log you pointed to was also specific about why the validation stopped:

validating rs.dns-oarc.net/CNAME: checking existence of DS at 'rs.dns-oarc.net'
validating rs.dns-oarc.net/CNAME: continuing validation would lead to deadlock: aborting validation
validating rs.dns-oarc.net/CNAME: deadlock found (create_fetch)

The rs.dns-oarc.net zone is broken because it returns a CNAME for queries at the apex.

Observe the delegation (I'm querying one of the servers auth for dns-oarc.net):

; <<>> DiG 9.11.2 <<>> +norec +dnssec @64.191.0.65 rs.dns-oarc.net NS
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 43571
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 3, ADDITIONAL: 3

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
; COOKIE: 47d4eddbbbde6fd18616a25b5a952d35767788ad0b03038f (good)
;; QUESTION SECTION:
;rs.dns-oarc.net. IN NS

;; AUTHORITY SECTION:
rs.dns-oarc.net. 3600 IN NS ns00.rs.dns-oarc.net.
rs.dns-oarc.net. 3600 IN NSEC rs4.dns-oarc.net. NS RRSIG NSEC
rs.dns-oarc.net. 3600 IN RRSIG NSEC 8 3 3600 20180328101103 20180226091103 12093 dns-oarc.net. floDmByYaxmh+QQWou7PtICj4tnpW6/ea1WzatUfAEMvPOSmm54CJ467 KWpnf5XADFgFrcHOr0gYLlbFVJrwEB5n6R+SvXOTx9zwgva3SY37Vgq8 ZMwdNPdGxmVLOz1Ou5tByfZV2ZLpueF+hBB12wft+wNCysjMuwtx4U2D a64=

;; ADDITIONAL SECTION:
ns00.rs.dns-oarc.net. 3600 IN A 64.191.0.133
ns00.rs.dns-oarc.net. 3600 IN AAAA 2620:ff:c000:0:2::133

Then look at the query response for a DS RRset that the BIND validator is receiving from ns00.rs.dns-oarc.net:

; <<>> DiG 9.11.2 <<>> +norec +dnssec @64.191.0.133 rs.dns-oarc.net DS
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 61119
;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 27, ADDITIONAL: 28

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;rs.dns-oarc.net. IN DS

;; ANSWER SECTION:
rs.dns-oarc.net. 60 IN CNAME rst.x1013.rs.dns-oarc.net.

;; AUTHORITY SECTION:
x1013.rs.dns-oarc.net. 60 IN NS ns00.x1013.rs.dns-oarc.net.
x1013.rs.dns-oarc.net. 60 IN NS ns01.x1013.rs.dns-oarc.net.
x1013.rs.dns-oarc.net. 60 IN NS ns02.x1013.rs.dns-oarc.net.
--- snip (lots of NS RRs) ---

This is a CNAME at the apex of the delegated zone - I can't get NS or SOA RRs either, and that's what the updated validator is unhappy about.

Prior to the changes to stop the potential validation loop (which probably wasn't going to be a loop in this specific instance, but BIND didn't know that), clients using validating BIND to send a reply-size-test query would have 'got away with it'. But no longer.

But since the reply-size tester doesn't work any more anyway with modern BIND, does this matter?

Cathy
_______________________________________________
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
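
For anyone reading this thread later, the check_deadlock comparison discussed above can be illustrated with a small, self-contained C sketch. This is not BIND's code. The names struct sketch_validator, same_name, would_deadlock and the match_cname flag are hypothetical and exist only for this illustration, the walk over parent validators is an assumption about how the check is applied, and the conditions hidden behind "..." in the quoted snippet are left out. All it tries to show is why adding the CNAME comparison makes a DS lookup count as a deadlock when the same name is already being validated as a CNAME.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* RR type codes as assigned by IANA. */
enum { TYPE_CNAME = 5, TYPE_DS = 43 };

/* Hypothetical stand-in for a validator and the RRset it is working on. */
struct sketch_validator {
    const struct sketch_validator *parent; /* validator that started this one */
    const char *name;                      /* owner name being validated */
    int type;                              /* RR type being validated */
};

/* DNS owner names compare case-insensitively. */
static bool same_name(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return false;
        a++;
        b++;
    }
    return *a == *b; /* true only if both strings ended together */
}

/*
 * Return true if fetching name/type on behalf of val would wait on work
 * that val or one of its ancestors is already doing.
 * match_cname = false models the comparison quoted for 9.12.0rc1;
 * match_cname = true models the one quoted for 9.12.0rc3 and above.
 */
static bool would_deadlock(const struct sketch_validator *val,
                           const char *name, int type, bool match_cname)
{
    assert(name != NULL);
    for (const struct sketch_validator *v = val; v != NULL; v = v->parent) {
        bool type_hit = (v->type == type) ||
                        (match_cname && v->type == TYPE_CNAME);
        if (type_hit && same_name(v->name, name))
            return true;
    }
    return false;
}
```

With a validator working on rs.dns-oarc.net/CNAME, asking whether a DS fetch at rs.dns-oarc.net would deadlock gives false with match_cname set to false and true with it set to true, which is consistent with the "deadlock found" log lines quoted above.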