Hi,

I'm new to posting on this list, so please accept my advance apologies if I
make any novice errors or have posted this in the wrong place.  Apologies
also for the long email. :-)

I can't claim to have the same detailed knowledge of the protocol as the
authors of this draft.  All the same, I've been mulling this
child/parent-centric resolver question for a while and watching its impact
on our customers and developers.  This draft seems to resolve this question
with the conclusion that child-centric (non-sticky?) is the correct
behaviour.

Shumon Huque:
> There is a range of different behaviors in resolver implementations
> in this respect today, and it would be good if we could agree on
> more commonality.

I agree.  Having a predictable standard behavior (at least for recent,
well-behaved resolvers) is very desirable.

The recent thread on the DNS OARC list seemed to frame this question as a
trade-off mostly between these points:

[a] https://tools.ietf.org/html/rfc2181#section-5.4.1 saying the in-zone NS
    is authoritative and more trustworthy.  (pro: child-centric)
[b] flexibility for DNS operators to lower the effective TTL on a
    delegation during changes despite registries fixing their TTL; see
    paragraph 4 in
    https://tools.ietf.org/html/draft-huque-dnsop-ns-revalidation-01#section-2
    (pro: child-centric)
[c] the additional complexity resolvers will have to bear  (pro:
    parent-centric)
[d] making resolution deterministic  (pro: parent-centric)

I'd like to draw attention to a fifth item which I haven't seen addressed.

[e] obeying the principle of least astonishment for mortal DNS operators
who do not understand this subtlety (and who I assume are the overwhelming
majority).  (pro: parent-centric)

The child-centric resolver behaviour applied by many resolvers today is
probably clear to most who read this list and DNS OARC.   But it's very
counter-intuitive to everyone else.  In my experience the overwhelming
majority of people operating DNS do not understand this very subtle point.
They all tend to assume the parent-centric behaviour.  Perhaps that's
because many of them have a software developer background and are familiar
with tree data structures and linked lists.  Put another way, child-centric
resolvers effectively insist on there being two sources of truth for a
delegation, which is very surprising.
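To make the "two sources of truth" point concrete, here is a rough sketch
of how the two resolver behaviours could pick a name server set when the
parent and child disagree.  It is a deliberately simplified model, not any
real resolver's logic; the function name, the policy strings and the
example host names are all made up for illustration.

```python
def ns_set_in_use(parent_ns, child_ns, policy):
    """Return the NS set a resolver ends up using under a simplified model.

    parent_ns: NS names in the parent's delegation (hypothetical data)
    child_ns:  NS names published in the child zone itself (hypothetical data)
    policy:    "parent-centric" or "child-centric" (labels assumed here)
    """
    if policy == "parent-centric":
        # Only the delegation in the parent matters.
        return parent_ns
    if policy == "child-centric":
        # The in-zone NS set replaces the delegation once it has been learned;
        # fall back to the parent's set if the child publishes nothing.
        return child_ns or parent_ns
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical redelegation where the in-zone NS set was copied from the
# previous provider and never updated.
parent = frozenset({"ns1.new-provider.example.", "ns2.new-provider.example."})
child = frozenset({"ns1.old-provider.example.", "ns2.old-provider.example."})

print(sorted(ns_set_in_use(parent, child, "parent-centric")))
print(sorted(ns_set_in_use(parent, child, "child-centric")))
```

In this toy model an operator who only looks at the parent sees the new
provider, while a child-centric resolver ends up talking to the old one.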

My organization tends to operate with every team being enabled (and
expected) to own their own service, including its DNS, so we have a lot of
software developers and others who don't have time or inclination to read
DNS RFCs and books dealing with DNS but still need to do it.  We delegate
domains at multiple levels to give people that autonomy.  We're also a DNS
vendor for many public customers.  I've seen lots of outages and security
issues either caused by or prolonged by people surprised by child-centric
resolver behaviour.   So much so, that I am inclined to prefer
parent-centric.

[a] seems like a definition which could be changed if it was so decided.
DS records are totally parent-centric, for example.  It seems like NS could
be too if we declared the in-zone NS to be "informational only".  I
understand the desire for [b], but it seems to propose to dictate a
specific behavior at every level of delegation in order to work around a
problem that only exists with second level delegations (those managed by
registries).  In so doing, it optimizes for an arcane but admittedly useful
flexibility that only a tiny minority of DNS operators will ever understand
how to use.  At the same time, we'd be standardizing on behavior that
surprises the majority of operators and at times causes them (or at least
leads them into) outages, even when they are working at a 3rd or 4th level
delegation (i.e. not a registry).

There are two categories of unfortunate "surprises" we commonly see due to
child-centric resolvers.  The first surprise is that when redelegating a
third level domain, it's obvious to moderately experienced operators that
they must lower the parent NS TTL in order to get a fast rollback, but as
they don't realise the in-zone NS takes over, they don't lower that TTL.
Now their fast rollback plan is ineffective on child-centric resolvers.
It's great to see in this draft that the "delegation revalidation" section
of the draft seems to solve that sharp edge by choosing the minimum TTL at
the delegation.  If we conclude we must have child-centric behaviour, this
at least makes it safer than today.   But still the misunderstanding points
to the surprising behaviour of child-centric resolvers.   The second
surprise category is a variety of subtle misconfigurations which we have
seen at the in-zone NS which operators don't understand (copying the NS
from the previous zone, altering the NS in an effort to get a full sideways
delegation, just plain errors, etc).  When we explain these problems, our
customers say "but the child NS isn't used for delegation, what are you
talking about?".   "dig +trace" also ignores the in-zone NSes (that could
be fixed of course, but it reinforces how people think about delegations).
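The rollback surprise can be shown with a small arithmetic sketch.  It
assumes a simplified model in which a child-centric resolver keeps the
delegation for the in-zone NS TTL, and a revalidating resolver keeps it for
the minimum of the parent and child TTLs (roughly what the draft is
described as doing above).  The function name and the TTL values are
hypothetical.

```python
def effective_delegation_ttl(parent_ns_ttl, child_ns_ttl, revalidate):
    """Seconds a child-centric resolver may keep using a delegation.

    Simplified model, not the draft's exact algorithm:
    - without revalidation, the in-zone (child) NS TTL takes over;
    - with revalidation, the smaller of the two TTLs bounds the lifetime.
    """
    if revalidate:
        return min(parent_ns_ttl, child_ns_ttl)
    return child_ns_ttl


parent_ttl = 300      # hypothetical: lowered ahead of the redelegation
child_ttl = 172800    # hypothetical: in-zone NS TTL left at 2 days

print(effective_delegation_ttl(parent_ttl, child_ttl, revalidate=False))  # 172800
print(effective_delegation_ttl(parent_ttl, child_ttl, revalidate=True))   # 300
```

With these example values the "fast rollback" takes up to two days without
revalidation and about five minutes with it.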

It probably shouldn't drive the decision, but choosing parent-centric will
also remove the problems outlined in the draft around non-compliant DNS
implementations which don't answer NS or do so incorrectly.  That's not
critical, but it will make life easier.  The original reason those
implementations exist is very likely due to the odd "two sources of truth"
situation we're in.  If the in-zone NS becomes "informational only", that
non-compliance will probably continue, but it will become less of an issue.

So, to summarize, compared with today, I think the draft as written would
be an improvement as it will bring consistency.  Delegation revalidation
will also remove a sharp edge which causes problems for less experienced
DNS operators in a world of child-centric resolvers.  However, on balance,
I would suggest it would be better to optimize for DNS delegations which
are "intuitive for the majority of DNS operators" instead of "flexible for
a (tiny?) minority".  I think parent-centric is a better choice for DNS and
I don't think registries should force the DNS standard into making a
Sophie's Choice like this.

I hope this helps,
Gavin

PS How truly intractable is the registry argument?  It seems something like
"When an NS change is made, TTL=3600 for the first N hours, then 2 days
thereafter." would be a major step forward without drastically increasing
complexity.
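A minimal sketch of what such a registry-side rule could look like,
assuming the TTL is chosen purely from the time since the last NS change.
N is left open above, so the 24-hour default below is only a placeholder,
and the function and parameter names are hypothetical.

```python
HOUR = 3600
DAY = 24 * HOUR


def registry_ns_ttl(seconds_since_ns_change, window_hours=24,
                    short_ttl=3600, long_ttl=2 * DAY):
    """TTL a registry might serve for a delegation's NS RRset.

    window_hours stands in for the unspecified "N hours"; 24 is an
    arbitrary placeholder, not a recommendation.
    """
    if seconds_since_ns_change < window_hours * HOUR:
        # Recently changed delegation: keep the TTL short to allow rollback.
        return short_ttl
    # Stable delegation: return to the usual long TTL.
    return long_ttl


print(registry_ns_ttl(2 * HOUR))   # 3600
print(registry_ns_ttl(3 * DAY))    # 172800
```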



On Tue, Apr 14, 2020 at 8:24 AM Bob Harold <rharo...@umich.edu> wrote:

>
> On Mon, Apr 13, 2020 at 4:59 PM Shumon Huque <shu...@gmail.com> wrote:
>
>> On Fri, Apr 10, 2020 at 12:51 PM Bob Harold <rharo...@umich.edu> wrote:
>>
>>> Having read through the draft, and twice through the emails, I think the
>>> draft has the right balance in using the parent and child NS RRsets
>>> properly.
>>>
>>> I think the "extra" query for the child NS, sent once per parent TTL, is
>>> a savings over the older method of sending the NS records as "additional
>>> data" in every response.
>>>
>> Thank you Bob. I just want to qualify your last observation a bit - the
>> extra query would be once per minimum of parent and child TTL, roughly.
>>
>> Shumon.
>>
>
> Ah, yes, thanks for the correction.
>
> --
>
> Bob Harold
>
> _______________________________________________
> DNSOP mailing list
> DNSOP@ietf.org
> https://www.ietf.org/mailman/listinfo/dnsop
>