On 12/7/2017 7:53 PM, Wes Hardaker wrote:
Michael StJohns <m...@nthpermutation.com> writes:

Much improved - but still some disconnects (all review is de novo):
Thanks, Mike.  All good comments.  I've attached responses and actions
(or inactions) below and will push a new version shortly as well.

Wes Hardaker



4 TODO in 3 - sigExpirationTimeRemaining - two items:
=====================================================

   "latestSigExpirationTime" -> lastSigExpirationTime.  and measured from
   when?   I think its "when the addWaitTime calculation is run" or
   "lastSigExpirationTime - now"

   + Response: I think it's fairly self-evident from the text that it's
     "when used", which is indeed at least during addWaitTime
     calculation.  But it's more conceptual, and 'the amount of time
     remaining' I think is pretty clear to mean "from now".  I thought
     about adding something like "from now" into the sentence, but that
     didn't seem better to me and added unneeded complexity to the
     sentence.  Suggestions?
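
   To make the intended reading concrete, here is a minimal sketch of the
   computation as I understand it (the function name and types are mine,
   not draft text): sigExpirationTimeRemaining is simply
   lastSigExpirationTime minus "now", evaluated when the addWaitTime
   calculation is run.

     # Sketch only: sigExpirationTimeRemaining measured "from now", i.e. at
     # the moment the addWaitTime calculation is run.
     from datetime import datetime, timezone

     def sig_expiration_time_remaining(last_sig_expiration_time):
         """Seconds of signature validity left on the old records, from now."""
         now = datetime.now(timezone.utc)
         return (last_sig_expiration_time - now).total_seconds()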





6 WONTDO Delete 6.1.4 - activeRefreshOffset - it's a nonsensical value that is
==============================================================================

   only valid from the resolver's point of view.  For a given
   publisher/authoritative server there will be as many
   activeRefreshOffsets as there are resolvers, so the publisher must
   assume the worst case of activeRefresh.

   + Response: I disagree here.  This was put in after a discussion with
     Matthijs Mekking to specifically address the case where the polling
     period may not end up right in time-step with the addHoldDownTime.
     The end result is that an RFC5011 resolver may be polling at a
     different frequency based on the activeRefresh time.  We are
     defining a minimum, mathematically defined safety value, and this
     value is calculable, so it should remain.  There is a hint of
     guidance text that says you can set it to activeRefresh for
     simplicity if you want.  But the mathematical line falls before a
     full second activeRefresh.  And the value itself needs to be
     included to account for the odd frequency slippage.

     TL;DR: this isn't a guidance document, it's a mathematical-line
     document.


  The problem is that the polling period is a resolver concept, not a publisher concept.  And there are as many different polling intervals (all with different start times, but with lengths approximately equal to the activeRefresh interval) as there are resolvers.

Look.  The publisher publishes at time A - .000000001 (or any time before this), where A is the last expiration time of the old records.  Resolver First does an active query at A + .00000001, Resolver Last does a query at A + activeRefresh interval - .000000001.  All other well-behaved resolvers will have made a query at or after the first resolver and at or before the last resolver (with a safety slop added in for lost packets and retransmits done under the fast query rate).  Resolver Last will then (along with all the other resolvers) wait at least addHoldDownTime and then make one last query.  The relationship between when that last query is made and the resolver's perception of when the addHoldDownTime expired can be different for each resolver, with the sole requirement that the resolver waited for the next query after the addHoldDownTime expired.

A perfectly operating resolver with a perfect clock, perfect connectivity, and no outages MIGHT possibly keep a perfect interval between each query it makes (making your activeRefreshOffset meaningful), but 10000 resolvers ALL keeping perfect intervals?  NWIH.  So for the purposes of this we assume that ALL resolvers' addHoldDownTime expired just after their last query, meaning that we assume that ALL resolvers waited at least another queryInterval after the addHoldDownTime before making the query that triggers trust anchor installation.

What the calculations represent is the worst case for any given resolver, but actually the overall best-case numbers for the publisher.  The safetyFactor deals with the clock and query drift across the large population of resolvers and moves the publisher slightly off their best-case estimate.

So the total interval from after the last RRSIG expiration of the old stuff is activeRefresh (the latest query from all well-behaved resolvers after the last expiration) + addHoldDownTime (.0000001 after this is when the first resolver installs the trust anchor) + activeRefresh (when the well-behaved and connected resolvers have installed the new trust anchors) + safetyFactor (when the guys who dropped a few packets along the way have installed the trust anchors).
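
As a back-of-the-envelope check, the interval above is just the sum of those four terms.  A minimal sketch in Python (the numbers are purely illustrative, not recommended values):

    # Total interval after the last RRSIG expiration of the old records,
    # per the argument above.  Values are illustrative only.
    active_refresh = 12 * 3600        # worst-case poll interval, seconds
    add_hold_down  = 30 * 24 * 3600   # RFC 5011 add hold-down time, seconds
    safety_factor  = 2 * 24 * 3600    # slop for drops, retransmits, drift

    total = (active_refresh       # last query made after the expiration
             + add_hold_down      # every resolver's hold-down has expired
             + active_refresh     # well-behaved resolvers have re-queried and installed
             + safety_factor)     # the stragglers
    print(total / 86400.0, "days")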

You got it wrong.  It's not a matter of disagreement about whether this is guidance vs. math, it's about choosing the parameters in a way that's meaningful from the publisher's point of view.  You picked data that is only meaningful in the context of a single resolver, but which reduces to activeRefresh for a large population of resolvers.

Don't make it more complex than it needs to be.




7 WONTDO 6.2.1 - replace activeRefreshOffset with activeRefresh - worst case
============================================================================

   value. (Wait Timer Based Calculation)


8 WONTDO fix 6.2.1.1 delete the term for addHoldDownTime % activeRefresh - the
==============================================================================

   "2 * activeRefresh" in the previous term covers both the activeRefresh
   interval at the beginning of the acceptance period and the
   activeRefresh interval at the end.
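
   If it helps, the arithmetic behind dropping that term (as I read it) is
   just that rounding the hold-down up to the next poll boundary can never
   cost more than one full activeRefresh, which the existing
   "2 * activeRefresh" already pays for.  A small check, with throwaway
   illustrative values and names of my own:

     # Rounding addHoldDownTime up to the next activeRefresh boundary never
     # exceeds addHoldDownTime + activeRefresh, so a separate
     # "addHoldDownTime % activeRefresh" term buys nothing extra.
     import math

     def next_poll_at_or_after(hold_down, active_refresh):
         return math.ceil(hold_down / active_refresh) * active_refresh

     for hold_down in (2591999, 2592000, 2592001):    # around 30 days
         for active_refresh in (3600, 43200, 86400):  # 1h, 12h, 1d
             assert next_poll_at_or_after(hold_down, active_refresh) \
                    <= hold_down + active_refresh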


9 WONTDO In 6.2.2 - same changes as for 6.2.1 and 6.2.1.1 (e.g. get rid of
==========================================================================

   activeRefreshOffset throughout).


   ,----
   | lastSigExpirationTime
   | v
   | |<- activeRefresh ->|<--- addHoldDownTime --->|<- activeRefresh ->|<- safetyMargin ->|
   |                                               ^                   ^                  ^
   |                                acceptanceStarts   acceptance begins       earliestSafe
   |                                                   to complete
   `----

   After the second activeRefresh interval all of the well-behaved and
   well-connected resolvers should have the new trust anchor.  The
   safetyMargin adds some space for poorly behaving or intermittently
   connected resolvers, or those with some drops in queries.
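
   In wall-clock terms, the picture above works out to something like the
   following sketch (names and values are mine and purely illustrative):

     # Wall-clock version of the timeline above, measured from
     # lastSigExpirationTime.  Illustrative values only.
     from datetime import datetime, timedelta, timezone

     last_sig_expiration_time = datetime(2018, 1, 1, tzinfo=timezone.utc)
     active_refresh = timedelta(hours=12)
     add_hold_down  = timedelta(days=30)
     safety_margin  = timedelta(days=2)

     # After activeRefresh + addHoldDownTime + activeRefresh, the well-behaved
     # and well-connected resolvers should have the new trust anchor.
     all_well_behaved_done = (last_sig_expiration_time
                              + active_refresh + add_hold_down + active_refresh)
     earliest_safe = all_well_behaved_done + safety_margin
     print(earliest_safe.isoformat())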


10 DONE Section 6.3 has one too many activeRefresh terms in both formulas -
===========================================================================

   here are the corrected ones:

   remWaitTime = sigExpirationTimeRemaining + activeRefresh + safetyFactor

   remWallClockTime = lastSigExpirationTime + activeRefresh + safetyFactor

   Basically, assuming no attacker, and no drops, all well-behaved
   resolvers will see the revocation after one activeRefresh interval
   from the time of publication.  Add the safety factor to take care of
   the slackers.  This is a fine value for normal revocations where
   you're pretty sure that the key hasn't been compromised.

   There is no hold-down timer for revocation - revocations take effect
   immediately upon receipt and validation.
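
   Spelled out with throwaway numbers (purely illustrative), the corrected
   formulas are just:

     # Corrected revocation formulas from above.  Values are illustrative only.
     from datetime import datetime, timedelta, timezone

     active_refresh = 12 * 3600                     # seconds
     safety_factor  = 24 * 3600                     # seconds
     sig_expiration_time_remaining = 5 * 24 * 3600  # seconds left on the old RRSIGs

     rem_wait_time = sig_expiration_time_remaining + active_refresh + safety_factor

     last_sig_expiration_time = datetime(2018, 1, 1, tzinfo=timezone.utc)
     rem_wall_clock_time = (last_sig_expiration_time
                            + timedelta(seconds=active_refresh + safety_factor))

     print(rem_wait_time / 86400.0, "days;", rem_wall_clock_time.isoformat())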


11 WONTDO Guidance about key compromise
=======================================

   In the case of a key compromise, I would suggest that the revoked key
   be published for the same interval as you would use for adding a new
   trust anchor.  (But of course, this won't actually matter all that
   much if you only have a single trust anchor....)

   + Good advice to put in a general advice document in the future

Yes and no.  How you do revocation depends on why you are doing the revocation and what the downside is if you don't get it right.  In the one-in/one-out scheme that the root uses, you can use the shorter publication period.  If you're dealing with a compromised key, you need different guidance.  Both of these are guidance for the publisher based on different input conditions, and ignoring one in favor of the other makes little sense.  The math I gave you is correct for the input conditions you specified, but they aren't the only possible input conditions.  This document started because you believed that publishers weren't making the right assumptions about publication intervals and needed some guidance to explain what they needed to think about.  This seems to be the same class of misguidance: if a publisher uses this guidance for a compromised key, they might (probably will) be making a mistake.

Later, Mike



12 DONE Appendix A - fix the calculations to match up with the section 6 formulas.
==================================================================================


_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop
