Re: Deprecation notice for BIND 9: "resolver-nonbackoff-tries", "resolver-retry-interval"
On 07. 12. 23 1:05, Fred Morris wrote:
> On Wed, 6 Dec 2023, Evan Hunt wrote:
>
> I say go ahead, if nothing else consider it a "scream test". But can you take a moment and tell us which stakeholder group(s) you think you're optimizing for, why, and how?

On the technical level we optimize using real (anonymized!) traffic provided to us by operators. Here's what we need:
https://kb.isc.org/docs/collecting-client-queries-for-dns-server-testing

If you want us to optimize for your use-case let's talk how we can get the data and replicate your setup!

--
Petr Špaček
Internet Systems Consortium
--
Visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

ISC funds the development of this software with paid support subscriptions. Contact us at https://www.isc.org/contact/ for more information.

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
RE: dnssec-delegation seems to be broken from .gov to bls.gov
Point taken and understood.

But you know how it is when there is a major outage: the push from upper management is always for "fix it now" and get us up and running, do your RCA later.

Thanks
Sandeep

-----Original Message-----
From: Mark Andrews
Sent: Wednesday, December 6, 2023 10:19 PM
To: Bhangui, Sandeep - BLS CTR
Cc: Nick Tait; bind-users@lists.isc.org
Subject: Re: dnssec-delegation seems to be broken from .gov to bls.gov

CAUTION: This email originated from outside of BLS. DO NOT click (select) links or open attachments unless you recognize the sender and know the content is safe. Please report suspicious emails through the "Phish Alert Report" button on your email toolbar.

More to the point why was the old KSK removed *before* checking that the DS record for the new KSK was published and had been for the TTL of the DS RRset? With proper procedures this should not happen. When something goes wrong / is delayed in a key rollover the process should stall until that step is complete, not proceed blindly ahead.

> On 7 Dec 2023, at 07:35, Bhangui, Sandeep - BLS CTR via bind-users wrote:
>
> The problem has been resolved.
> The automatic KSK rollover on the dotgov.gov did not happen properly and once we manually updated the DS record with the correct KSK keytags and keys things were fixed.
> All is good now.
> Now to see if we can find out as to why the automatic KSK failover on the dotgov.gov did not happen correctly.
> Thanks
> Sandeep
>
> From: bind-users On Behalf Of Nick Tait via bind-users
> Sent: Wednesday, December 6, 2023 3:23 PM
> To: bind-users@lists.isc.org
> Subject: Re: dnssec-delegation seems to be broken from .gov to bls.gov
> On 7/12/2023 9:05 am, Nick Tait via bind-users wrote:
> > I could be wrong, but based on the output above it looks like the current TTL is 0, which means that doing this should provide immediate relief.
>
> Sorry it looks like the DNS server on the Wi-Fi network I'm connected to has done something weird with the TTL.
> This is what I get when querying one of the "gov." authoritative servers directly:
>
> $ dig -t ds bls.gov @a.ns.gov +norecurse
>
> ; <<>> DiG 9.18.18-0ubuntu2-Ubuntu <<>> -t ds bls.gov @a.ns.gov +norecurse
> ;; global options: +cmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 32241
> ;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
>
> ;; OPT PSEUDOSECTION:
> ; EDNS: version: 0, flags:; udp: 1232
> ;; QUESTION SECTION:
> ;bls.gov.                IN      DS
>
> ;; ANSWER SECTION:
> bls.gov.         3600    IN      DS      50951 8 2 E6B0A294066904F20A2B8EBA3FA9920F9A1822802977F59D706B30A1 77F7DC0C
>
> ;; Query time: 16 msec
> ;; SERVER: 2001:503:ff40::1#53(a.ns.gov) (UDP)
> ;; WHEN: Thu Dec 07 09:19:24 NZDT 2023
> ;; MSG SIZE  rcvd: 84
>
> This means when you remove the DS record, it will take 1 hour to fully take effect (assuming no delay replicating between authoritative servers).
>
> Nick.

--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: ma...@isc.org
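
[Editorial sketch] The "stall until that step is complete" rule from Mark's message can be written down as a single check. This is only an illustration in Python under stated assumptions: the function name and its parameters are hypothetical and do not correspond to any BIND or registry tooling, the timestamps in the usage lines are made up, and the only wait modeled is one full TTL of the DS RRset (3600 seconds in the dig output quoted in this thread). A real rollover also has to account for propagation between the parent's authoritative servers.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def old_ksk_may_be_removed(
    new_ds_first_seen: Optional[datetime],
    ds_ttl_seconds: int,
    now: datetime,
) -> bool:
    """Hypothetical helper: True only once the DS for the new KSK has been
    visible at the parent for at least one full DS TTL."""
    if new_ds_first_seen is None:
        # The parent has not published the new DS yet, so the rollover stalls.
        return False
    # Caches may still hold the old DS RRset until its TTL has run out.
    return now - new_ds_first_seen >= timedelta(seconds=ds_ttl_seconds)


# Made-up timestamps, for illustration only.
seen = datetime(2023, 12, 6, 12, 0, tzinfo=timezone.utc)
print(old_ksk_may_be_removed(None, 3600, seen))
print(old_ksk_may_be_removed(seen, 3600, seen + timedelta(minutes=30)))
print(old_ksk_may_be_removed(seen, 3600, seen + timedelta(hours=1)))
```

The point of returning False when the new DS has never been observed is that a delayed or failed parent-side update blocks the next step instead of being skipped.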
Re: Deprecation notice for BIND 9: "resolver-nonbackoff-tries", "resolver-retry-interval"
I welcome birds of a feather. Need to define / refine the problem statement first.

On 12/7/23 12:30 AM, Petr Špaček wrote:
> On 07. 12. 23 1:05, Fred Morris wrote:
>> On Wed, 6 Dec 2023, Evan Hunt wrote:
>> I say go ahead, if nothing else consider it a "scream test". But can
>> you take a moment and tell us which stakeholder group(s) you think
>> you're optimizing for, why, and how?
>
> On the technical level we optimize using real (anonymized!) traffic
> provided to us by operators. Here's what we need:
> https://kb.isc.org/docs/collecting-client-queries-for-dns-server-testing
>
> If you want us to optimize for your use-case let's talk how we can get
> the data and replicate your setup!

I run Dnstap (for $reasons), but I'd be able to run dnscap and from the look of that KB page you only want the queries. I'm not sure that really captures the qualitative issue(s).

I plan to dig into this some more over the winter anyway, maybe I should turn the tables and ask if there are other systemic issues I should look at or for?

I'm using DNS largely for purposes other than FQDN -> address mapping. The things I've written have gotten enough uptake that I'm past the "kook" stage and into the "conspiracy" stage, but although I get some feedback at this point it's all basically anecdotal; I don't have a "movement" that I can ask for disciplined feedback.

I've done a number of different things poking at the same elephant over the past few years, and what I consistently see is a focus on "a query and a response" and I'm not sure that that is adequate systems thinking for the issues at hand. There seem to be a number of them, and they all point to inadequate systems thinking. That happens.

As a neighboring example, adding more packet buffering to routers and wifi hotspots should be an unambiguous Good Thing, right? Even a decade after finding out that it's not, there are still people and constituent groups which haven't gotten the memo.
The key thing I'm going to set up and examine this winter is the impact of qname minimization. But there are enough of these that maybe some sort of memo is in order. Maybe somebody else wants to work on it with me?

So here are some things which I've noticed about DNS in the field and lack of systems thinking. The first two (frags and TC=1) are fairly well known, and are provided as known examples where systems thinking is weak and what this means. But most importantly: "systems thinking in the DNS is provably weak".

Frags. Frags are good? No, they are bad. If a single UDP frag isn't delivered, the packet can't be reassembled. The server thinks all is fine and good and Procrustes' algorithm has made it all fit. The packet failing to be reassembled means that at the application layer no reply was received from the server. It really doesn't matter whether TC=1 is set or not, because it will never make it to the application. If traffic shaping mistakenly and simplistically thinks "dropping UDP is ok" it is double good for UDP frags.

TC=1 is permission-based; (different implication) what if it only works over TCP? There is no provision in the algorithm to try TCP if no response is received via UDP. The 1980s recursion algorithm makes the decision to use TCP a polite society thing. The querant doesn't just try it. It waits for the server to say "here you are, this is what I can do for you; but I encourage you to please try again with TCP" and the querant thinks "oh how nice of you, what an excellent idea; thank you I will". There is no provision in the algo to unilaterally try TCP when UDP has failed to perform well or at all. This is arguably most important for stub resolvers.

If the issue was simply buffer bloat, then forcing queries over TCP wouldn't provide observably better performance (yet it often does, which is why this is worth mentioning). The suspicion has to be traffic shaping, but I don't know that that's the case; crappy SOHO routers are largely black boxes.
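
[Editorial sketch] The missing provision described above, trying TCP unilaterally when UDP stays silent rather than only when the server sets TC=1, might look like the following. This is a sketch under assumptions, not how BIND or any particular stub resolver behaves: `query_with_tcp_fallback`, the `Transport` hook type, and the `udp_attempts` default are all hypothetical names chosen for illustration, and the transports are injected so the policy can be read without any network code. The only protocol fact relied on is the position of the TC bit in the DNS header (RFC 1035, section 4.1.1).

```python
from typing import Callable, Optional

# Hypothetical hook: takes a wire-format query, returns the wire-format reply,
# or None if nothing usable arrived before the transport's own timeout.
Transport = Callable[[bytes], Optional[bytes]]


def is_truncated(reply: bytes) -> bool:
    """TC is bit 0x02 of the third header byte (RFC 1035, section 4.1.1)."""
    return len(reply) >= 3 and bool(reply[2] & 0x02)


def query_with_tcp_fallback(
    query: bytes,
    send_udp: Transport,
    send_tcp: Transport,
    udp_attempts: int = 2,
) -> Optional[bytes]:
    """Try UDP a few times, then use TCP whether or not the server asked."""
    for _ in range(udp_attempts):
        reply = send_udp(query)
        if reply is None:
            # Lost query, lost reply, or a fragment that never arrived:
            # the application cannot tell these apart.
            continue
        if is_truncated(reply):
            # The "polite" path: the server invited a retry over TCP.
            break
        return reply
    # Reached on TC=1 and also when every UDP attempt was silent.
    return send_tcp(query)
```

With a UDP transport that always returns None, the function still ends up calling the TCP transport, which is exactly the case the permission-based TC=1 mechanism does not cover.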
As an aside: are people still blocking TCP/53? It wasn't that long ago that this was conventional security theater.

Aggressive UDP retry presumes fast over correct responses, or at least "correct enough" even if not the most timely. In pursuit of happy eyeballs, speed over everything else! The fastest thing is a static zone file which never changes. But the real world today encompasses forwarders as well as database backends (and this is for FQDN -> address mappings!) and in the quest for the fastest possible response caches get built on top of the database so that something can be served meeting the objectives of what is measured (response time). Without going into technical details, please accept that this increases complexity and the work needed to be done to keep what's served to the querant as fresh as practicable.

On the other hand if a typical response time of 1/10th of a second is acceptable, there's time to wait for the database and no need for the additional complexity. Some datastores might take even longer than th
Re: Deprecation notice for BIND 9: "resolver-nonbackoff-tries", "resolver-retry-interval"
On 07. 12. 23 22:12, Fred Morris wrote:
> I welcome birds of a feather. Need to define / refine the problem statement first.
>
> On 12/7/23 12:30 AM, Petr Špaček wrote:
>> On 07. 12. 23 1:05, Fred Morris wrote:
>>> On Wed, 6 Dec 2023, Evan Hunt wrote:
>>> I say go ahead, if nothing else consider it a "scream test". But can you take a moment and tell us which stakeholder group(s) you think you're optimizing for, why, and how?
>>
>> On the technical level we optimize using real (anonymized!) traffic provided to us by operators. Here's what we need:
>> https://kb.isc.org/docs/collecting-client-queries-for-dns-server-testing
>>
>> If you want us to optimize for your use-case let's talk how we can get the data and replicate your setup!
>
> I run Dnstap (for $reasons), but I'd be able to run dnscap and from the look of that KB page you only want the queries. I'm not sure that really captures the qualitative issue(s).
>
> I plan to dig into this some more over the winter anyway, maybe I should turn the tables and ask if there are other systemic issues I should look at or for?

We are certainly interested in hearing what metric is of interest and what we might be missing. Right now we monitor everything we can from statistics provided by DNS Shotgun [1], BIND statistics channel [2], and system resource monitoring [3].

[1] https://dns-shotgun.readthedocs.io/en/stable/
[2] https://bind9.readthedocs.io/en/v9.19.18/reference.html#namedconf-statement-statistics-channels
[3] https://gitlab.isc.org/isc-projects/resource-monitor/-/blob/main/resmon.yaml

--
Petr Špaček
Internet Systems Consortium