On Mon, Jul 24, 2023 at 10:00:37AM +0530, Mukund Sivaraman wrote:
> When seeing prescriptive text, implementors often want to know the
> rationale behind it. If the value of 5 is changed to 1, please mention
> and have the authors include in the document why the lower limit is
> 1s. Is it an arbitrary change? Is this change based on the default
> value of BIND's servfail-ttl named.conf option?
Yes, it is. For background: BIND implemented a SERVFAIL cache in 2014 with a default cache duration of 10 seconds; after a slew of complaints, in 2015 we lowered it to 1 second, and also reduced the configurable maximum from 5 minutes to 30 seconds. The reason was that certain common failure conditions are transitory, and it's not unreasonable to prioritize rapid recovery.

Now, to be clear, the comparison isn't exactly apples to apples: the BIND SERVFAIL cache is a somewhat stupider mechanism than the one outlined in the draft. It caches *all* SERVFAIL responses, regardless of the reason they were generated. For example: when the cache is cold, a query may time out or hit DDoS mitigation limits before it's finished getting through the whole iteration process; an immediate retry would start further along the delegation chain and would succeed. Such problems weren't noticeable until we implemented the 10-second cache, but became very noticeable afterward.

If we were able to selectively cache *only* those SERVFAILs that are unlikely to recover soon, then five seconds might indeed be a good starting point. But with our relatively dumb cache, we found that one second did a fairly good job of reducing the processing burden from repeated queries, and eliminated the user complaints about the resolver taking forever to recover from short-lived problems. It's been working well enough that it hasn't been a priority to develop a more complex failure cache.

In any case, even with the assumption that future implementations *will* have better selectiveness, I'm leery of using 5 seconds as a hard minimum in an RFC. I think it's likely that some operators will find that excessive and want the option to tune it to a lower value. Also, if you *are* doing exponential backoff, then three failures in a row will get your duration up to 4 seconds anyway, so the difference between starting at 1 and starting at 5 isn't really all that significant.
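To make the mechanism concrete, here's a minimal sketch of a per-name SERVFAIL cache with doubling backoff and a 30-second ceiling. All identifiers here (ServfailCache, record_failure, and so on) are hypothetical illustrations under my reading of the description above, not BIND's actual implementation, which is in C and considerably more involved:

```python
import time


class ServfailCache:
    """Illustrative SERVFAIL cache: each repeated failure for a name
    doubles the cache duration, up to a configurable maximum.
    (Hypothetical sketch -- not BIND's internals.)"""

    def __init__(self, initial=1.0, maximum=30.0):
        self.initial = initial    # starting cache duration, seconds
        self.maximum = maximum    # ceiling, cf. BIND's 30s servfail-ttl cap
        self.entries = {}         # qname -> (expiry time, current duration)

    def record_failure(self, qname, now=None):
        """Cache a SERVFAIL for qname; doubling the duration on repeats."""
        now = time.monotonic() if now is None else now
        _, duration = self.entries.get(qname, (0.0, 0.0))
        duration = self.initial if duration == 0.0 else min(duration * 2,
                                                            self.maximum)
        self.entries[qname] = (now + duration, duration)
        return duration

    def is_cached(self, qname, now=None):
        """True if a cached SERVFAIL for qname is still in effect."""
        now = time.monotonic() if now is None else now
        expiry, _ = self.entries.get(qname, (0.0, 0.0))
        return now < expiry

    def record_success(self, qname):
        """A successful resolution clears the entry (rapid recovery)."""
        self.entries.pop(qname, None)
```

Note the design point the thread is about: with initial=1.0 the resolver recovers almost immediately from a transitory failure, while repeated failures still back off quickly.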
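The backoff arithmetic is easy to check. A small helper (hypothetical, not from the draft or any implementation) that lists the cache durations after successive failures, doubling up to a 300-second cap:

```python
def backoff_sequence(start, cap=300):
    """Cache durations after successive failures: the first failure
    caches for `start` seconds, each subsequent failure doubles the
    duration, capped at `cap` seconds."""
    seq, duration = [], start
    while True:
        seq.append(min(duration, cap))
        if duration >= cap:
            return seq
        duration *= 2
```

Starting at 5 gives the draft's seven-step sequence (5, 10, 20, 40, 80, 160, 300); starting at 1 takes ten failures to reach the 300-second cap (1, 2, 4, ..., 256, 300).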
> > * Note that the original text has this as SHOULD. I've heard reasons
> > for both SHOULD and MAY.
>
> What are these reasons?

I suggested MAY because I think exponential backoff is a pretty specific (and rather aggressive) approach to cache timing, and I'm not entirely comfortable with it having the almost-mandatory force of a SHOULD. The original text says a series of seven resolution failures would increase the duration before a retry to five minutes: 5 seconds to 10 to 20 to 40 to 80 to 160 to 300. Lowering the starting value to one second means it would take ten failures to reach 300.

IMHO, keeping the recovery period flat, or increasing it linearly (5, 10, 15, etc.), could also be operationally reasonable choices, so I'm not sure why we need to be so emphatic about *this* particular backoff strategy in the RFC. I have no objection to mentioning it, but it felt like a MAY to me. It's a mild preference, though, and if I'm the only one who feels that way, I won't argue about it further.

--
Evan Hunt -- e...@isc.org
Internet Systems Consortium, Inc.

_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop