On 12/11/2017 8:03 PM, Wes Hardaker wrote:
Michael StJohns <m...@nthpermutation.com> writes:

Hi Mike,

Thanks for explaining your thinking because I think, after reading it:
we're actually in agreement but using different terms for where to put
in the slop you're worried about.

Specifically:

A perfectly operating resolver with perfect clock and perfect
connectivity and no outages MIGHT possibly keep a perfect interval
between each query it makes (making your activeRefreshOffset
meaningful), but 10000 resolvers ALL keeping perfect intervals?
Yes, I agree.  But, this is why I want the majority of the equation to
be defining the mathematical perfect certainty.  And then *after* that,
add the operational slop factor (safetyMargin) to account for both
problems and reality (you forgot to add "speed of light issues" in your
text above, for example ).
(sigh - safety factor deals with speed of light issues....DUH)


No, no, no, no, no.

Thus, I break the equation into two critical parts:

addWallClockTime = lastSigExpirationTime
                    + addHoldDownTime
                    + activeRefresh                   ^
                    + activeRefreshOffset             |
                                                      |
Precise Math                                         |
-----------------------------------------------------|
Needed Fuzz                                          |
                                                      |
                    + safetyMargin                    |
                                                      v
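
To make the two-part split concrete, here is a minimal sketch of the equation as drawn above. The function name and all input values are hypothetical; the 12-hour activeRefresh, the zero activeRefreshOffset (assumed because 30 days is a multiple of 12 hours) and the 1-day safetyMargin are placeholders chosen only for illustration, not values proposed in this thread.

```python
from datetime import datetime, timedelta, timezone


def add_wall_clock_time(last_sig_expiration, add_hold_down, active_refresh,
                        active_refresh_offset, safety_margin):
    """Return (precise, padded) wall-clock times for the equation above."""
    # "Precise Math" part: everything above the dividing line.
    precise = (last_sig_expiration + add_hold_down
               + active_refresh + active_refresh_offset)
    # "Needed Fuzz" part: the operational slop added afterwards.
    return precise, precise + safety_margin


# Hypothetical inputs, for illustration only.
last_sig = datetime(2018, 1, 1, tzinfo=timezone.utc)
precise, padded = add_wall_clock_time(
    last_sig,
    add_hold_down=timedelta(days=30),
    active_refresh=timedelta(hours=12),
    active_refresh_offset=timedelta(0),  # assumed zero: 30 days divides evenly
    safety_margin=timedelta(days=1),     # placeholder slop value
)
print(precise.isoformat())
print(padded.isoformat())
```

Keeping the two results separate mirrors the diagram: the first value is the part that can be argued about mathematically, the second only adds the slop term.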

If you'd reorder this properly, you can probably get the right answer. For the first part of the discussion, assume no drift or retransmits between (1) and (2) and between (3) and (4).

1) T == lastSigExpirationTime and, microscopically after this, the time at which the server sees the first query from any resolver starting their trust anchor installation.
2) T + activeRefresh is the time at which the server sees the last query from the last resolver just starting their trust anchor installation.
3) T + activeRefresh + addHoldDownTime is the time at which the server sees the first query from any resolver finalizing its trust anchor installation.
4) T + activeRefresh + addHoldDownTime + activeRefresh is the time at which the server sees the last query from the last resolver finalizing its trust anchor installation.
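
The four points above can be sketched as offsets from T. This is only an illustration under the stated no-drift, no-retransmit assumption; the function name and the sample values (12-hour activeRefresh, 30-day addHoldDownTime) are hypothetical.

```python
from datetime import timedelta


def server_timeline(active_refresh, add_hold_down):
    """Offsets from T = lastSigExpirationTime for points (1) through (4)."""
    first_start = timedelta(0)                        # (1) first resolver starts
    last_start = first_start + active_refresh         # (2) last resolver starts
    first_finalize = last_start + add_hold_down       # (3) first resolver finalizes
    last_finalize = first_finalize + active_refresh   # (4) last resolver finalizes
    return [first_start, last_start, first_finalize, last_finalize]


# Hypothetical values, for illustration only.
timeline = server_timeline(timedelta(hours=12), timedelta(days=30))
for number, offset in enumerate(timeline, start=1):
    print(f"({number}) T + {offset}")
```

With these sample values point (4) lands at T + 31 days, that is, addHoldDownTime plus two activeRefresh intervals.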

(1) is the earliest time any resolver can start its installation (assuming an attack) because it's also the time when all of the old signatures expire. (2) is the time at which a resolver that had its last activeRefresh just before T (and because of that wasn't able to start its installation) will send its first installation query. Between (2) and (3), any given resolver may drift or retransmit, with the result that it may end up making a query just before (3), placing its next and final query at (3) plus activeRefresh.

Finally, to deal with drift and retransmits between (1) and (2), and between (3) and (4), we add a safetyFactor. That deals with about 99.9999% of drift and retransmits but will never deal with the servers that have been offline or otherwise unable to get their queries completed. The retransmits in a given client's addHoldDown period only really move the end point for that resolver and don't affect the overall safetyFactor of the set of resolvers.


IE, if a perfect resolver hitting a RFC5011 zone with an activeRefresh
that evenly divides into 30 days:

   1) queries at T--- = lastSigExpirationTime - .000001
   2) queries at T+1--- = lastSigExpirationTime - .000001 + activeRefresh
Yes.
   3) Notes that it just saw a new key (assuming worst case #1 is replayed)
   4) starts timer
   5) will query again at lastSigExpirationTime + 30 days - .000001
No - from the server's point of view, the worst client (which is the only one the server cares about) will make its last query before trust anchor installation at lastSigExpirationTime + activeRefresh (when the last CLIENT saw its first valid update) + 30 days - .0000001.
   6) notes this is still in waiting period
   7) will query again at lastSigExpirationTime + 30 days - .000001 + activeRefresh
Nope.   The worst client will query again at (from the server's point of view) lastSigExpiration + activeRefresh + addHoldDown (30) + activeRefresh.

From a given client's point of view the last query can happen anywhere from (lastSigExpiration + addHoldDown + .00000001) to (lastSigExpiration + activeRefresh + addHoldDown + activeRefresh).   The server only cares about the worst (latest) case.
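
As a rough check of that range, the sketch below computes both ends of the window and compares the late end with a bound that contains only a single activeRefresh. The function name and sample values are hypothetical, and the tiny epsilon on the early end is ignored.

```python
from datetime import timedelta


def final_query_window(active_refresh, add_hold_down):
    """Offsets from lastSigExpiration for the earliest and latest final query."""
    earliest = add_hold_down  # just after this instant; epsilon ignored here
    latest = active_refresh + add_hold_down + active_refresh
    return earliest, latest


# Hypothetical values, for illustration only.
refresh = timedelta(hours=12)
hold_down = timedelta(days=30)
earliest, latest = final_query_window(refresh, hold_down)

# A bound with only one activeRefresh (offset taken as zero for these values).
single_refresh_bound = hold_down + refresh
shortfall = latest - single_refresh_bound
print(earliest, latest, shortfall)
```

For any inputs the shortfall equals one activeRefresh, which is the interval the two sides of this thread disagree about.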


   8) now notes that it's been 30 days and accepts key

There is only 1 activeRefresh in that sequence.  And that's what's in
the equation.  Because the timing distance between #7 and #2 is still 30
days when queried to the perfect sub-nano second.

Nope.  Not from the server's point of view.



Then there should be a bunch of delays inserted, network timeouts, etc.
That's where the safetyMargin should come in and catch all the issues
with the impreciseness of the real world.  Now, if you want to add an
activeRefresh to the already defined safetyMargin suggested term, I'm
willing to consider that.  But it shouldn't be listed as part of
anything but the slop term for security analysis clarity.

Would you like to add more time to the safetyMargin to deal with the
non-perfect world, including clock drift because of time delays in a
bunch of queries back to back or any other reason?


Ending note about the precise timeline: when 30 days isn't divisible by
the activeRefresh, then you need to add the other term we haven't talked
about much which is the activeRefreshOffset which accounts for this
case.
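
The divisible and non-divisible cases can be compared with a small simulation of the perfect resolver from steps 1-8 above. This is only a sketch: the function name, the 1-second epsilon and the 12-hour and 7-hour intervals are hypothetical, times are integer seconds relative to lastSigExpirationTime, and how the extra wait it reports maps onto the activeRefreshOffset term is not spelled out in this message.

```python
def perfect_resolver(first_query, active_refresh, add_hold_down, key_visible_from=0):
    """Return (timer_start, accept_time) for a drift-free resolver."""
    t = first_query
    # Queries before the old signatures expire may be answered with replayed data.
    while t < key_visible_from:
        t += active_refresh
    timer_start = t  # first query that sees the new key; hold-down timer starts
    # Keep querying on a perfect schedule until the hold-down time has elapsed.
    while t - timer_start < add_hold_down:
        t += active_refresh
    return timer_start, t


HOLD_DOWN = 30 * 24 * 3600  # 30 days in seconds

# 12 hours divides 30 days evenly; 7 hours does not (both hypothetical).
start_even, accept_even = perfect_resolver(-1, 12 * 3600, HOLD_DOWN)
start_odd, accept_odd = perfect_resolver(-1, 7 * 3600, HOLD_DOWN)

extra_even = accept_even - (start_even + HOLD_DOWN)
extra_odd = accept_odd - (start_odd + HOLD_DOWN)
print(extra_even, extra_odd)
```

In the evenly divisible case the resolver accepts exactly when its timer expires; in the 7-hour case it has to wait for its next scheduled query after the timer has run out.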

And again. NO.  The retransmits over a given set of clients in the addHoldDown period will result in at least one client (the "worst" client) ending up making a query just before the expiration of ITS addHoldDown timer.   Assuming the worst case of at least one client making a query just before the lastSigExpirationTime, and that same client drifting/retransmitting enough to make a query just before its addHoldDown time, the activeRefreshOffset is a useless value to calculate.

Later, Mike


Cheers,
Wes


_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop
