On Thu, Sep 7, 2017 at 10:07 PM, Mark Andrews <ma...@isc.org> wrote:

>
> Part of the problem is that we have one TTL value for both freshness
> and "don't use beyond".
>
> This is fixable.  It is possible to specify two timer values.  It
> does require adding signaling between recursive servers and
> authoritative servers, on zone transfers and update requests.
>
> You basically add an additional timer field to every record immediately
> after the TTL field.  This is only returned if the client has
> signalled support for the extended field.  I suggest using the last
> DNS header bit for this, as you can determine how you will parse the
> response based on whether the bit is set in the response or not.
> This field is used to expire records from the cache, and its value
> is set to the TTL field if the server has learnt the record from a
> server that doesn't support the extension.
>
> The existing TTL field is used for freshness checking.  When a query
> comes in after that value has expired a freshness check is performed
> similar to the existing prefetches that happen today.  A TTL of 1
> is returned unless the original TTL was 0 in which case 0 is returned.
>
> New client - new recursive server - new authservers
>
>         example.com. 300 86400 IN A 1.2.3.4
>
>                 +300 seconds
>
>         example.com. 1 86100 IN A 1.2.3.4
>          (background query is in process)
>
> Old client - new recursive server - new authservers
>
>         example.com. 300 IN A 1.2.3.4
>
>                 +300 seconds
>
>         example.com. 1 IN A 1.2.3.4
>          (background query is in process)
>
> New client - new recursive server - old auth servers
>
>         example.com. 300 300 IN A 1.2.3.4
>
>                 +300 seconds
>          (record has expired from cache,
>           new query is performed)
>
>         example.com. 300 300 IN A 1.2.3.4
>
> For UPDATE a replacement opcode would be the cleanest way to signal the
> new format is being used.  NOTIMP should be returned by servers
> that don't support the new opcode.
>
> There will be a few broken servers that just echo back the new
> header bit.
>
> This way the authoritative servers still control how long records
> are stored for.  Dead servers will get a little bit of traffic until
> the refresh completes.  If the authoritative servers are under
> attack the clients still see an answer.
>
> The alternative is to perform the refresh query and if it fails to
> complete within X milliseconds return the cached data rather than
> returning the cached data and doing the refresh in the background.
>
> Mark
>
> --
> Mark Andrews, ISC
> 1 Seymour St., Dundas Valley, NSW 2117, Australia
> PHONE: +61 2 9871 4742                 INTERNET: ma...@isc.org
>

While I like the idea of a "don't use beyond" timer, I think it will be a
very long time before it is widely deployed (and actually configured by
zone owners), and therefore won't solve our immediate need.  It would be
great if clients could opt-in, but again I don't see that happening anytime
soon.  So I would start with resolver-operators deciding what seems best
for their clients (which is what is happening whether we like it or not).
Adding client opt-out/opt-in would be good.  Signalling to say that a
response is stale would be good.  Adding the second timer (both per-RR and
as a zone default value, like TTL is handled) would be good.
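
To make the two-timer proposal quoted above concrete, here is a rough
sketch of how a recursive server's cache might answer from a record that
carries both timers.  This is only an illustration under stated
assumptions, not anything from an existing implementation: the names
CachedRecord, make_record and answer are hypothetical, times are whole
seconds, and the background refresh itself is left out (the function only
reports that one is needed).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CachedRecord:
    """Hypothetical cache entry carrying both timers from the proposal."""
    name: str
    rdata: str
    ttl: int        # freshness timer (the existing TTL field), seconds
    expire: int     # "don't use beyond" timer (the proposed field), seconds
    stored_at: int  # time the record entered the cache, seconds


def make_record(name: str, rdata: str, ttl: int,
                expire: Optional[int], now: int) -> CachedRecord:
    # If the upstream server does not support the extension there is no
    # second timer, so it is set to the TTL, as the proposal describes.
    if expire is None:
        expire = ttl
    return CachedRecord(name, rdata, ttl, expire, now)


def answer(rec: CachedRecord, now: int, client_supports_extension: bool
           ) -> Optional[Tuple[int, Optional[int], bool]]:
    """Return (ttl, expire, needs_refresh), or None if the record has
    expired from the cache and a new query must be performed."""
    age = now - rec.stored_at
    remaining_expire = rec.expire - age
    if remaining_expire <= 0:
        return None

    remaining_ttl = rec.ttl - age
    needs_refresh = remaining_ttl <= 0
    if needs_refresh:
        # Freshness timer ran out: hand out TTL 1 (or 0 if the original
        # TTL was 0) while the freshness check runs in the background.
        ttl_out = 0 if rec.ttl == 0 else 1
    else:
        ttl_out = remaining_ttl

    # Old clients never see the extra field.
    expire_out = remaining_expire if client_supports_extension else None
    return (ttl_out, expire_out, needs_refresh)


# The three cases from the quoted message, 300 seconds after caching.
new_auth = make_record("example.com.", "1.2.3.4", 300, 86400, now=0)
old_auth = make_record("example.com.", "1.2.3.4", 300, None, now=0)

new_client = answer(new_auth, now=300, client_supports_extension=True)
old_client = answer(new_auth, now=300, client_supports_extension=False)
expired = answer(old_auth, now=300, client_supports_extension=True)
```

With these inputs, new_client corresponds to "example.com. 1 86100 IN A
1.2.3.4" with a refresh pending, old_client gets the same answer without
the second timer, and expired is None because a record learnt from an old
authoritative server drops out of the cache when its TTL runs out.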

On a related note - the SOA "expire" timer tells a slave how long to keep
serving "stale" zone data when the master cannot be reached.  Would that be
a reasonable default value for how long a resolver should serve "stale"
data when the authoritative servers cannot be reached?  (Currently I think
most people set a very high value compared to the TTL.)
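
As a rough sketch of what I mean by using SOA "expire" as the default:
the resolver would take the zone's expire value as the window for serving
stale data, and fall back to not serving stale at all when it has no SOA
for the zone.  The function names and the optional operator_cap parameter
are my own assumptions for illustration (a cap might be wanted precisely
because expire is usually set so high); nothing here is from an existing
resolver.

```python
from typing import Optional


def stale_window(soa_expire: Optional[int],
                 operator_cap: Optional[int] = None) -> int:
    """Seconds past TTL expiry for which stale data may still be served.

    soa_expire:   the zone's SOA expire value, or None if unknown.
    operator_cap: hypothetical resolver-side upper bound, or None for no cap.
    """
    if soa_expire is None:
        return 0  # no SOA seen for the zone: do not serve stale data
    if operator_cap is None:
        return soa_expire
    return min(soa_expire, operator_cap)


def may_serve_stale(seconds_past_ttl: int,
                    soa_expire: Optional[int],
                    operator_cap: Optional[int] = None) -> bool:
    # Assumes the TTL has already run out and the authoritative servers
    # could not be reached; only the age of the stale data is checked here.
    return seconds_past_ttl <= stale_window(soa_expire, operator_cap)
```

For example, with an expire of 604800 (one week) and no cap, data that is
one hour past its TTL could still be served, while data more than a week
past its TTL could not.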

-- 
Bob Harold
_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop
