Responses inline.
On 4/11/19 18:50, Matthew Pounsett wrote:
On Wed, 10 Apr 2019 at 16:43, Richard Gibson
<richard.j.gib...@oracle.com> wrote:
The first problem is for the owner of the ANAME-containing zone, for whom the
upstream misconfiguration will result in downtime and be extended by caching to
the MINIMUM value from their SOA, which in many cases will be one to three
orders of magnitude greater than the TTL of the ANAME itself.
I think I'm missing something here. If, for example, the TTL of the
ANAME is 1 hour, what mechanism results in caching holding onto a
negative answer for a broken target name for a minimum of 10 hours,
and up to 40 days?
Demonstrative example zone:
example.com. 3600 IN SOA ns.example.net. hostmaster.example.net. 1 (
                         7200   ; REFRESH
                         3600   ; RETRY
                         86400  ; EXPIRE
                         3600 ) ; MINIMUM
example.com. 60 IN ANAME example.invalid.
example.com. 60 IN A 192.0.2.1
When an ANAME-aware resolver queries an ANAME-aware authoritative server
for example.com. A, it will receive the A record in the answer section
and the ANAME in the additional section. If it then chases the ANAME
target to an NXDOMAIN and accepts that as justification for replacing
the sibling A RRSet with nothing as currently specified in the draft,
then the appropriate response will be a Type 2 NODATA in which the
answer section is empty and the authority section contains the SOA. But
this suffers from both of the problems I have been complaining about: the
resolver does not necessarily /have/ the zone SOA, possibly
necessitating an inline lookup, and per RFC 2308 the negative response
will be cached according to values from the SOA that are unrelated to
and far exceed the TTL of the ANAME.
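To put numbers on that, here is a minimal Python sketch of the RFC 2308 rule that a negative answer is cached for the lesser of the SOA record's own TTL and its MINIMUM field. The `Soa` class and `negative_cache_ttl` function are hypothetical names used only for illustration; the values are taken from the demonstrative zone above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Soa:
    """Only the SOA fields that matter for negative caching (illustrative)."""
    ttl: int      # TTL of the SOA record itself, in seconds
    minimum: int  # SOA MINIMUM field, in seconds


def negative_cache_ttl(soa: Soa) -> int:
    """Negative-answer TTL per RFC 2308: min(SOA TTL, SOA MINIMUM)."""
    return min(soa.ttl, soa.minimum)


# Values from the demonstrative zone: SOA TTL 3600, MINIMUM 3600, ANAME TTL 60.
EXAMPLE_SOA = Soa(ttl=3600, minimum=3600)
ANAME_TTL = 60

neg_ttl = negative_cache_ttl(EXAMPLE_SOA)
ratio = neg_ttl / ANAME_TTL
print(f"negative answer cached for {neg_ttl}s, {ratio:.0f}x the ANAME TTL of {ANAME_TTL}s")
# prints: negative answer cached for 3600s, 60x the ANAME TTL of 60s
```

Even in this modest example the negative answer outlives the ANAME TTL by a factor of 60, and zones with a larger MINIMUM and SOA TTL would widen the gap further.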
Both of these problems can be addressed by allowing/recommending/requiring
ANAME-aware servers to preserve ANAME siblings when resolution of ANAME targets
results in NXDOMAIN or NODATA responses, rather than replacing them with an
empty RRSet... which, to be honest, seems to be always-undesirable behavior
anyway—if anyone can think of a scenario where it would be beneficial to
dynamically remove ANAME siblings, please share it.
I feel like this is creating an even bigger potential problem. What
happens when the owner of the ANAME target legitimately wants that
name to go away, but some other zone owner is leaving an ANAME in
place pointing to that now-nonexistent name? Continuing to serve the
sibling records indefinitely seems like serve-stale gone horribly
wrong.
In such a configuration, the owner of the ANAME will be able to see that
clients are using its static sibling records rather than its target (and
therefore that they are getting no benefit from the ANAME), and can
react accordingly. If your concern is excess queries for the ANAME
target, then this seems no different from e.g. CNAME—the owner of the
target can issue long-lived negative responses while performing whatever
other exploration and/or mitigation they deem fit.
But this seems like it will be much more rare and frankly much less of a
problem than stretching out misconfiguration at an ANAME target into
extended downtime for an ANAME owner. It must be possible for the latter
to execute a recovery plan as quickly as possible, and if ANAME is
specified well then the first step of recovery can be literally
instant and automatic.
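A minimal sketch of that automatic first step, assuming a server that already knows the outcome of resolving the ANAME target. `TargetStatus` and `choose_answer` are hypothetical names for illustration and do not come from the draft; the point is only that NXDOMAIN or NODATA at the target leaves the static sibling records in place instead of replacing them with an empty RRSet.

```python
from enum import Enum
from typing import List, Optional


class TargetStatus(Enum):
    """Outcome of resolving the ANAME target (illustrative)."""
    OK = "ok"
    NXDOMAIN = "nxdomain"
    NODATA = "nodata"


def choose_answer(
    siblings: List[str],
    status: TargetStatus,
    target_records: Optional[List[str]] = None,
) -> List[str]:
    """Return the address records to serve at the ANAME owner name."""
    if status is TargetStatus.OK and target_records:
        # Target resolved: serve its records in place of the siblings.
        return list(target_records)
    # NXDOMAIN or NODATA at the target: keep the static siblings rather
    # than answering with an empty RRSet.
    return list(siblings)


# Sibling A record from the demonstrative zone; its target is example.invalid.
print(choose_answer(["192.0.2.1"], TargetStatus.NXDOMAIN))
# prints: ['192.0.2.1']
```

With this behavior, clients keep receiving the sibling A record at its own 60-second TTL while the ANAME owner investigates, and no SOA-derived negative caching comes into play.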
_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop