Hi, I'm new to posting on this list, so please accept my apologies in advance if I make any novice errors or have posted this in the wrong place. Apologies also for the long email. :-)
I can't claim to have the same detailed knowledge of the protocol as the authors of this draft. All the same, I've been mulling this child/parent-centric resolver question for a while and watching its impact on our customers and developers. This draft seems to resolve this question with the conclusion that child-centric (non-sticky?) is the correct behaviour.

Shumon Huque:
> There is a range of different behaviors in resolver implementations
> in this respect today, and it would be good if we could agree on
> more commonality.

I agree. Having a predictable standard behavior (at least for recent, well-behaved resolvers) is very desirable.

The recent thread on the DNS OARC list seemed to frame this question as a trade-off mostly between these points:

[a] https://tools.ietf.org/html/rfc2181#section-5.4.1 saying the in-zone NS is authoritative and more trustworthy. (pro: child-centric)

[b] flexibility for DNS operators to lower the effective TTL on a delegation during changes despite registries fixing their TTL. (pro: child-centric) Paragraph 4 in https://tools.ietf.org/html/draft-huque-dnsop-ns-revalidation-01#section-2

[c] the additional complexity resolvers will have to bear (pro: parent-centric)

[d] making resolution deterministic (pro: parent-centric)

I'd like to draw attention to a fifth item which I haven't seen addressed.

[e] obeying the principle of least astonishment for mortal DNS operators who do not understand this subtlety (and who I assume are the overwhelming majority). (pro: parent-centric)

The child-centric resolver behaviour applied by many resolvers today is probably clear to most who read this list and DNS OARC. But it's very counter-intuitive to everyone else. In my experience the overwhelming majority of people operating DNS do not understand this very subtle point. They all tend to assume the parent-centric behaviour. Perhaps that's because many of them have a software developer background and are familiar with tree data structures and linked lists.
Put another way, child-centric resolvers effectively insist on there being two sources of truth for a delegation, which is very surprising.

My organization tends to operate with every team being enabled (and expected) to own their own service, including its DNS, so we have a lot of software developers and others who don't have the time or inclination to read DNS RFCs and books but still need to operate DNS. We delegate domains at multiple levels to give people that autonomy. We're also a DNS vendor for many public customers. I've seen lots of outages and security issues either caused by or prolonged by people surprised by child-centric resolver behaviour. So much so that I am inclined to prefer parent-centric.

[a] seems like a definition which could be changed if it was so decided. DS records are totally parent-centric, for example. It seems like NS could be too if we declared the in-zone NS to be "informational only".

I understand the desire for [b], but it seems to propose to dictate a specific behavior at every level of delegation in order to work around a problem that only exists with second level delegations (those managed by registries). In so doing, it optimizes for an arcane but admittedly useful flexibility that only a tiny minority of DNS operators will ever understand how to use. At the same time, we'd be standardizing on behavior that surprises the majority of operators and at times causes them (or at least leads them into) outages, even when they are working at a 3rd or 4th level delegation (i.e. not a registry).

There are two categories of unfortunate "surprises" we commonly see due to child-centric resolvers.

The first surprise is that when redelegating a third level domain, it's obvious to moderately experienced operators that they must lower the parent NS TTL in order to get a fast rollback, but as they don't realise the in-zone NS takes over, they don't lower that TTL. Now their fast rollback plan is ineffective on child-centric resolvers.
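To make the first surprise concrete, here is a minimal Python sketch of how long a stale NS set might linger in a resolver cache under each policy. It is a deliberately simplified model, not a description of any particular resolver: the `Delegation` class, the `cached_ns_ttl` function, the policy names and the TTL values are all hypothetical and chosen only for illustration. The "revalidation" policy models the minimum-of-both-TTLs behaviour as I understand it from the draft.

```python
from dataclasses import dataclass


@dataclass
class Delegation:
    parent_ns_ttl: int  # TTL (seconds) of the NS RRset served by the parent zone
    child_ns_ttl: int   # TTL (seconds) of the NS RRset at the child zone apex


def cached_ns_ttl(d: Delegation, policy: str) -> int:
    """Return how long a resolver could keep using the NS set (simplified model)."""
    if policy == "parent-centric":
        # Only the delegation NS RRset from the parent is used.
        return d.parent_ns_ttl
    if policy == "child-centric":
        # The child's authoritative NS RRset replaces the parent's copy,
        # so the child's TTL is the one that matters.
        return d.child_ns_ttl
    if policy == "revalidation":
        # Assumption: the delegation is re-checked after the smaller TTL.
        return min(d.parent_ns_ttl, d.child_ns_ttl)
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical rollback plan: the operator lowered the parent NS TTL to
# 5 minutes but left the in-zone NS TTL at 2 days.
plan = Delegation(parent_ns_ttl=300, child_ns_ttl=172800)

for policy in ("parent-centric", "child-centric", "revalidation"):
    print(f"{policy:15s} rollback takes up to {cached_ns_ttl(plan, policy)} s")
```

In this model the operator's 5 minute rollback only holds for the parent-centric and revalidating resolvers; the child-centric one may keep the old NS set for the full 2 days.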
It's great to see that the "delegation revalidation" section of the draft seems to solve that sharp edge by choosing the minimum TTL at the delegation. If we conclude we must have child-centric behaviour, this at least makes it safer than today. But still, the misunderstanding points to the surprising behaviour of child-centric resolvers.

The second category of surprise is a variety of subtle misconfigurations which we have seen at the in-zone NS and which operators don't understand (copying the NS from the previous zone, altering the NS in an effort to get a full sideways delegation, just plain errors, etc). When we explain these problems, our customers say "but the child NS isn't used for delegation, what are you talking about?". "dig +trace" also ignores the in-zone NSes (that could be fixed of course, but it reinforces how people think about delegations).

It probably shouldn't drive the decision, but choosing parent-centric will also remove the problems outlined in the draft around non-compliant DNS implementations which don't answer NS queries or do so incorrectly. That's not critical, but it will make life easier. The original reason those implementations exist is very likely the odd "two sources of truth" situation we're in. If the in-zone NS becomes "informational only", that non-compliance will probably continue, but it will become less of an issue.

So, to summarize, compared with today, I think the draft as written would be an improvement as it will bring consistency. Delegation revalidation will also remove a sharp edge which causes problems for less experienced DNS operators in a world of child-centric resolvers. However, on balance, I would suggest it would be better to optimize for DNS delegations which are "intuitive for the majority of DNS operators" instead of "flexible for a (tiny?) minority". I think parent-centric is a better choice for DNS and I don't think registries should force the DNS standard into making a Sophie's Choice like this.
I hope this helps,

Gavin

PS How truly intractable is the registry argument? It seems something like "When an NS change is made, TTL=3600 for the first N hours, then 2 days thereafter." would be a major step forward without drastically increasing complexity.

On Tue, Apr 14, 2020 at 8:24 AM Bob Harold <rharo...@umich.edu> wrote:

> On Mon, Apr 13, 2020 at 4:59 PM Shumon Huque <shu...@gmail.com> wrote:
>
>> On Fri, Apr 10, 2020 at 12:51 PM Bob Harold <rharo...@umich.edu> wrote:
>>
>>> Having read through the draft, and twice through the emails, I think the
>>> draft has the right balance in using the parent and child NS RRsets
>>> properly.
>>>
>>> I think the "extra" query for the child NS, sent once per parent TTL, is
>>> a savings over the older method of sending the NS records as "additional
>>> data" in every response.
>>
>> Thank you Bob. I just want to qualify your last observation a bit - the
>> extra query would be once per minimum of parent and child TTL, roughly.
>>
>> Shumon.
>
> Ah, yes, thanks for the correction.
>
> --
> Bob Harold
>
> _______________________________________________
> DNSOP mailing list
> DNSOP@ietf.org
> https://www.ietf.org/mailman/listinfo/dnsop