Tony Finch writes:
> This sounds like it will lead to stale answers being given instead of
> re-trying other potentially working servers.

The document is explicit that you need to keep trying to get an answer,
so if an implementation is not retrying other potentially working
servers, that is its own defect.

> I think serve-stale should only cover cases where servers are
> unreachable or unresponsive.

You are of course free to write your own implementation that way.
Having worked for operations where the authorities were concerned about
the possibility of accidental ServFails, I know that their preference
is that resolvers would serve-stale then too and enhance the overall
resiliency of the system.

If you think it would help, I can add some text to Implementation
Considerations about this, something like:

    Consider whether serve-stale should kick in for only the case of
    all servers being unresponsive, or whether authoritative servers
    responding with DNS RCODEs other than NoError and NXDomain also
    trigger it. Some authoritative server operators would prefer stale
    answers to be used in the event of their server failures, while
    other implementers see any answer from the authoritative server as
    being sufficient indication that any previously available answer
    for the question is superseded.

The implication of the latter view is that after a transition from good
answers to failure answers to unavailable, the stale answers will never
be available when they otherwise could have been, but so be it.

> If all a zone's servers start to reply REFUSED, that's a deliberate
> decision to disable the zone, and resolvers should not try to keep it
> alive beyond its TTL.

You cannot know that it is a deliberate decision to disable the zone.
In fact, I have direct operational experience of why it's a terrible
way to disable a zone.

One of my own servers was slammed with queries for a zone I was not
authoritative for. (It was a well-known zone, too, and one which is not
DNSSEC-signed.) My server was dutifully returning Refused to the
queries, and yet they kept coming very frequently, maxing the link.
Arguably the clients should have applied the techniques of RFC 2308 for
negative caching of those Refused answers, but it was not until I added
the zone in question to my server, so that it sent back an
authoritative answer with a proper caching signal, that the queries
really went away.

While it is obviously the best approach for a takedown to update the
delegation, in the situation where you have a delegation pointing to
server(s) that cannot be updated but you have control over the servers,
it is far better to provide an affirmative answer to the question than
to send Refused. Positive caching is much better understood than
negative caching by the wide variety of DNS implementations out there.

_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop
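
For anyone who finds the two positions in the proposed Implementation
Considerations text easier to compare in code, here is a minimal Python
sketch of the decision. It is not taken from the draft or from any
resolver. The names StalePolicy and may_serve_stale, and the convention
of representing a timeout as None, are assumptions made up for
illustration. Only the RCODE numbers come from RFC 1035.

```python
from enum import Enum
from typing import Optional, Sequence

# DNS RCODE values as assigned in RFC 1035.
NOERROR, SERVFAIL, NXDOMAIN, REFUSED = 0, 2, 3, 5


class StalePolicy(Enum):
    """Hypothetical policy knob; not a setting of any real resolver."""
    UNRESPONSIVE_ONLY = "unresponsive-only"
    ANY_FAILURE = "any-failure"


def may_serve_stale(outcomes: Sequence[Optional[int]],
                    policy: StalePolicy) -> bool:
    """Decide whether a stale cached answer may be used.

    outcomes holds one entry per authoritative server already tried:
    None for a timeout or unreachable server, otherwise the RCODE it
    returned. The resolver is assumed to keep retrying regardless of
    what this function returns.
    """
    if not outcomes:
        # Nothing has been tried yet, so there is no failure to cover.
        return False
    if any(rc in (NOERROR, NXDOMAIN) for rc in outcomes):
        # A usable fresh answer exists and supersedes the stale one.
        return False
    if policy is StalePolicy.ANY_FAILURE:
        # Timeouts and error RCODEs (ServFail, Refused, ...) all count.
        return True
    # UNRESPONSIVE_ONLY: any reply at all, even Refused, blocks stale use.
    return all(rc is None for rc in outcomes)
```

Under this sketch, a zone whose servers all return Refused would get
stale answers with ANY_FAILURE and would not with UNRESPONSIVE_ONLY,
which is the difference being argued about in this thread.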