Re: [DNSOP] New draft on delegation revalidation

Petr Špaček Thu, 28 May 2020 07:39:23 -0700


On 25. 05. 20 5:23, Shumon Huque wrote:
> On Thu, May 21, 2020 at 8:24 AM Petr Špaček <petr.spa...@nic.cz 
> <mailto:petr.spa...@nic.cz>> wrote:
> 
>     >
>     >    https://tools.ietf.org/html/draft-huque-dnsop-ns-revalidation-01
> 
>     I would appreciate a practical example of the changes envisioned in the
>     following paragraph:
> 
>     >    A common reason that zone owners want to ensure that resolvers place
>     >    the authoritative NS RRset preferentially in their cache is that the
>     >    TTLs may differ between the parent and child side of the zone cut.
>     >    Some DNS Top Level Domains (TLDs) only support long fixed TTLs in
>     >    their delegation NS sets, and this inhibits a child zone owner's
>     >    ability to make more rapid changes to their nameserver configuration
>     >    using a shorter TTL, if resolvers have no systematic mechanism to
>     >    observe and cache the child NS RRset.
> 
>     Could someone please post an example in steps? Something like:
>     - time 0, NSSET parent = {P0}, NSSET child = {C0}
>     - time 1, NSSET parent = {P1}, NSSET child = {C1}
>     ... along with a textual description of what the operator is hoping to
>     achieve?
> 
> 
> I'll try to come up with a step by step example later. But in the example
> cited, what the operator is trying to achieve is to change their nameserver
> configuration (e.g. switch to another set of nameservers for their zone) in
> such a way that (1) they can make that change visible to resolvers
> reasonably quickly, and (2) they are able to back out that
> change quickly if things go wrong. They can do this by lowering the TTL
> in the child NS set which is under their control -- assuming that resolvers
> are preferentially caching the child NS set, in accordance with the data
> ranking rules of the DNS protocol (the child NS set is authoritative).
> 
>     Ad 4.  Delegation Revalidation:
> 
>     I agree with the author's note "we would prefer to discard the extensive
>     mechanism" but the simple mechanism has too simple a description for me
>     to understand the consequences.
> 
> 
> I've heard the same from other implementers too. Paul V has mentioned that 
> there are some subtle corner cases that are dealt with more precisely by the 
> extensive algorithm (I can't recall the details right now, but maybe he will 
> elaborate for us). Even if we end up on the simple path, it would be good to 
> have a better understanding of those cases.
> 
>     >    The simple mechanism:
>     >
>     >    o  Cap the time to cache the child NS RRset to the lower of child and
>     >       parent NS RRset TTL.  The normal iterative resolution algorithm
>     >       will then cause delegation revalidation to naturally occur at the
>     >       expiration of the capped child NS TTL, along with dispatching of
>     >       the validation query to upgrade NS RRset credibility.
> 
>     So far so good, but it does not specify what should happen with RRsets
>     other than NS. Even if nothing is prescribed, please state that
>     explicitly.
> 
> 
> Thanks, yes I agree we should discuss all of these. Some quick thoughts for 
> now ..
> 
>     Most importantly:
>     - Does the NS TTL affect the maximum TTL of _other_ data in the zone?
> 
> 
> I think there are probably different views on what should happen here. Folks 
> who want very prompt takedown of "bad" domains will probably prefer a 
> complete pruning of the cache at the delegation point at the revalidation 
> interval, if the NS set has changed or disappeared. When this topic has come 
> up in the past, there has been pushback from some implementers that it's 
> difficult to do this because they use a non-tree data structure for the cache 
> (a hash table most commonly). Most resolvers these days already enforce a max 
> cache TTL parameter, so that typically prevents too much abuse. But at the 
> very least, they should probably use the revalidation interval as a signal to 
> stop "pre-fetching" records below the cut.
> 
>     - If it does, doesn't it increase the risk of thundering herd behavior?
> 
> 
> Possibly, depending on how popular the zone is, and what we decide is the 
> answer to the previous question. At any rate, implementers should always 
> employ strategies that bound how much work resolvers can be caused to do.


I'm not concerned about any single resolver instance; I'm more concerned about
a large number of resolver instances doing the same thing at the same time.

E.g. if the NS TTL was short (say 30 s) and it was used as a cap on the TTL of
all other records in the zone, then each resolver instance would clear the zone
from its cache every 30 seconds. That might cause interesting behavior when the
NS TTL is shortened, e.g. before an NS set change.

I do not know if there really is a problem; I'm just trying to explain why the
potential for a thundering herd needs to be seriously analyzed.
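
To make the concern a bit more concrete, here is a rough back-of-the-envelope
sketch in Python. It assumes that the NS TTL caps the TTL of every other record
in the zone and that each resolver refetches each popular name once per
effective TTL. The function names and all the numbers (10,000 resolvers, 50
popular names, 1 h record TTL) are made up for illustration; they are not
measurements.

```python
def capped_ttl(record_ttl: int, ns_ttl: int) -> int:
    """TTL a resolver would use if the NS TTL caps all other records (assumption)."""
    return min(record_ttl, ns_ttl)


def upstream_qps(resolvers: int, hot_names: int, record_ttl: int, ns_ttl: int) -> float:
    """Average queries/second reaching the authoritative servers, assuming each
    resolver refetches each hot name once per effective TTL."""
    return resolvers * hot_names / capped_ttl(record_ttl, ns_ttl)


# Hypothetical numbers, for illustration only.
before = upstream_qps(10_000, 50, 3600, ns_ttl=86_400)  # 1 day NS TTL: cap has no effect
after = upstream_qps(10_000, 50, 3600, ns_ttl=30)       # NS TTL lowered to 30 s
print(before, after, after / before)
```

With these made-up inputs the average load grows by a factor of about 120
(3600 s / 30 s). This only captures the average; the part that needs real
analysis is whether the expirations line up in time across resolvers.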


> 
> It might be reasonable to suggest that resolvers enforce a lower limit on the 
> child NS TTL (5 minutes?). If they see something less than that, it would be 
> set to the limit instead.
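
For illustration, a minimal sketch of how such a lower limit could combine with
the "lower of child and parent" cap from the simple mechanism. The order of
operations (raise the child TTL to the limit first, then cap by the parent TTL)
is my assumption, the draft does not specify it, and the names are hypothetical.

```python
MIN_CHILD_NS_TTL = 300  # 5 minutes, the value floated above; not a settled number


def child_ns_cache_ttl(child_ns_ttl: int, parent_ns_ttl: int,
                       floor: int = MIN_CHILD_NS_TTL) -> int:
    """Time to cache the child NS RRset under the simple mechanism plus a floor.

    Assumed order: a child TTL below the floor is raised to the floor,
    then the result is capped by the parent NS TTL.
    """
    clamped_child = max(child_ns_ttl, floor)
    return min(clamped_child, parent_ns_ttl)


print(child_ns_cache_ttl(30, 172_800))     # child TTL of 30 s is raised to the floor
print(child_ns_cache_ttl(3600, 172_800))   # child TTL above the floor is kept
print(child_ns_cache_ttl(86_400, 3600))    # parent TTL is the lower one, so it wins
```

Note that with this ordering a parent TTL below the limit still wins, so the
limit only bounds what the child side can force.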
> 
>     - If it does not, is it even worth the effort if an attacker can set a
>     week-long TTL on A/AAAA records and keep using that?
> 
> 
> Answered above ... 
>  
> 
>     - How should a resolver handle RCODE=NXDOMAIN? Should it have a different
>     effect than changing the NS set to a different set of servers?
> 
> 
> If resolvers are following RFC 8020 strictly, they are pruning their cache at 
> the delegation - that would be the ideal. Otherwise, they should allow cached 
> records below to live on, subject to max-cache TTL and disabling of 
> pre-fetching.
> 
>     Or change in DS record value?
> 
> 
> Which value? TTL or RDATA?
> 
> DS TTL expiration would automatically trigger a revalidation of the child SEP 
> keys at the parent. RFC 4035 says DS and delegating NS TTL SHOULD match, so 
> NS revalidation should happen on the same time scale. In reality though, they 
> are often different (COM/NET uses 2d for NS, 1d for DS). If NS > DS, a
> resolver could just decide to explicitly fetch the expiring DS first and wait
> out the delegating NS TTL if the DS rdata has not changed. But it seems much
> simpler to just say that if there is a secure delegation, then the DS TTL
> should be used as the NS revalidation interval too.
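
As I read it, the simpler rule boils down to something like the sketch below.
The function name is hypothetical, and representing "no DS" (insecure
delegation) as None is my assumption, not something from the draft.

```python
from typing import Optional


def ns_revalidation_interval(parent_ns_ttl: int, ds_ttl: Optional[int]) -> int:
    """Use the DS TTL for a secure delegation, else the delegating NS TTL.

    ds_ttl is None when there is no DS RRset at the parent (assumption).
    """
    return parent_ns_ttl if ds_ttl is None else ds_ttl


DAY = 86_400
print(ns_revalidation_interval(2 * DAY, 1 * DAY))  # COM/NET-style TTLs: 2d NS, 1d DS
print(ns_revalidation_interval(2 * DAY, None))     # insecure delegation
```

For the COM/NET values quoted above, a signed delegation would then be
revalidated daily and an unsigned one every two days.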
> 
>     For me personally, mixing two problems (GHOST domains and NS
>     inconsistency) in a single proposal does not help me understand the
>     reasoning behind the proposal and its intended effects.
> 
> 
> Ok, we'll think about how to make this clearer. But the two topics are 
> related. If resolvers prefer the authoritative child NS set, then timely 
> revalidation of the delegation at the parent is also necessary.

Yeah, you are right. It is probably a good idea to discuss both at once because
it will force us to find a compromise between the two. E.g. the most aggressive
approach, where the resolver prunes its cache on a mere NS set _change_, might
increase fragility when someone makes a mistake while changing the NS set.

Having said that, I still think it would help if the arguments/requirements for
each use case (GHOST vs. NS inconsistency / malicious vs. well-behaved
operator) were laid out separately, and the document then reconciled the
arguments from these two camps and reached a single recommendation for resolver
implementers.

-- 
Petr Špaček  @  CZ.NIC

_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop
