On 12/7/2017 7:53 PM, Wes Hardaker wrote:
Michael StJohns <m...@nthpermutation.com> writes:

Much improved - but still some disconnects (all review is de novo):
Thanks, Mike.  All good comments.  I've attached responses and actions
(or inactions) below and will push a new version shortly as well.

Wes Hardaker



4 TODO in 3 - sigExpirationTimeRemaining - two items:
=====================================================

   "latestSigExpirationTime" -> lastSigExpirationTime.  and measured from
   when?   I think its "when the addWaitTime calculation is run" or
   "lastSigExpirationTime - now"

   + Response: I think it's fairly self-evident from the text that it's
     "when used", which is indeed at least during addWaitTime
     calculation.  But it's more conceptual, and 'the amount of time
     remaining' I think is pretty clear to mean "from now".  I thought
     about adding something like "from now" into the sentence, but that
     didn't seem better to me and added unneeded complexity to the
     sentence.  Suggestions?
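
   To make the intended reading concrete, here is a minimal sketch of the
   computation as I understand it (the function name and types are mine,
   not draft text): sigExpirationTimeRemaining is simply
   lastSigExpirationTime minus "now", evaluated when the addWaitTime
   calculation is run.

     # Sketch only: sigExpirationTimeRemaining measured "from now", i.e. at
     # the moment the addWaitTime calculation is run.
     from datetime import datetime, timezone

     def sig_expiration_time_remaining(last_sig_expiration_time):
         """Seconds of signature validity left on the old records, from now."""
         now = datetime.now(timezone.utc)
         return (last_sig_expiration_time - now).total_seconds()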





6 WONTDO Delete 6.1.4 - activeRefreshOffset - it's a nonsensical value that is
==============================================================================

   only valid from the resolver's point of view.  For a given
   publisher/authoritative server there will be as many
   activeRefreshOffsets as there are resolvers, so the publisher must
   assume the worst case of activeRefresh.

   + Response: I disagree here.  This was put in after a discussion with
     Matthijs Mekking to specifically address the case where the polling
     period may not end up right in time-step with the addHoldDownTime.
     The end result is that an RFC5011 resolver may be polling at a
     different frequency based on the activeRefresh time.  We are
     defining a minimum, mathematically defined safety value, and this
     value is calculable, so it should remain.  There is a hint of
     guidance text that says you can set it to activeRefresh for
     simplicity if you want.  But the mathematical line falls before a
     full second activeRefresh.  And the value itself needs to be
     included to account for the odd frequency slippage.

     TL;DR: this isn't a guidance document, it's a mathematical-line
     document.


  The problem is that the polling period is a resolver concept, not a publisher concept.  And there are as many different polling intervals (all with different start times, but with lengths approximately equal to the activeRefresh interval) as there are resolvers.

Look.  The publisher publishes at time A - .000000001 (or any time before this), where A is the last expiration time of the old records.  Resolver First does an active query at A + .00000001, Resolver Last does a query at A + activeRefresh interval - .000000001.  All other well-behaved resolvers will have made a query at or after the first resolver and at or before the last resolver (with a safety slop added in for lost packets and retransmits done under the fast query rate).  Resolver Last will then (along with all the other resolvers) wait at least addHoldDownTime and then make one last query.  The relationship between when that last query is made and the resolver's perception of when the addHoldDownTime expired can be different for each resolver, with the sole requirement that the resolver waited for the next query after the addHoldDownTime expired.

A perfectly operating resolver with a perfect clock, perfect connectivity, and no outages MIGHT possibly keep a perfect interval between each query it makes (making your activeRefreshOffset meaningful), but 10000 resolvers ALL keeping perfect intervals?  NWIH.  So for the purposes of this we assume that ALL resolvers' addHoldDownTime expired just after their last query, meaning that we assume that ALL resolvers waited at least another queryInterval after the addHoldDownTime before making the query that triggers trust anchor installation.

What the calculations represent is the worst case for any given resolver, but actually the overall best-case numbers for the publisher.  The safetyFactor deals with the clock and query drift across the large population of resolvers and moves the publisher slightly off their best-case estimate.

So the total interval from after the last RRSIG expiration of the old stuff is activeRefresh (the latest query from all well-behaved resolvers after the last expiration) + addHoldDownTime (.0000001 after this is when the first resolver installs the trust anchor) + activeRefresh (when the well-behaved and connected resolvers have installed the new trust anchors) + safetyFactor (when the guys who dropped a few packets along the way have installed the trust anchors).
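
As a back-of-the-envelope check, the interval above is just the sum of those four terms.  A minimal sketch in Python (the numbers are purely illustrative, not recommended values):

    # Total interval after the last RRSIG expiration of the old records,
    # per the argument above.  Values are illustrative only.
    active_refresh = 12 * 3600        # worst-case poll interval, seconds
    add_hold_down  = 30 * 24 * 3600   # RFC 5011 add hold-down time, seconds
    safety_factor  = 2 * 24 * 3600    # slop for drops, retransmits, drift

    total = (active_refresh       # last query made after the expiration
             + add_hold_down      # every resolver's hold-down has expired
             + active_refresh     # well-behaved resolvers have re-queried and installed
             + safety_factor)     # the stragglers
    print(total / 86400.0, "days")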

You got it wrong.  It's not a matter of disagreement about whether this is guidance vs. math, it's about choosing the parameters in a way that's meaningful from the publisher's point of view.  You picked data that is only meaningful in the context of a single resolver, but which reduces to activeRefresh for a large population of resolvers.

Don't make it more complex than it needs to be.




7 WONTDO 6.2.1 - replace activeRefreshOffset with activeRefresh - worst case
============================================================================

   value. (Wait Timer Based Calculation)


8 WONTDO fix 6.2.1.1 delete the term for addHoldDownTime % activeRefresh - the
==============================================================================

   "2 * activeRefresh" in the previous term covers both the activeRefresh
   interval at the beginning of the acceptance period and the
   activeRefresh interval at the end.
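
   If it helps, the arithmetic behind dropping that term (as I read it) is
   just that rounding the hold-down up to the next poll boundary can never
   cost more than one full activeRefresh, which the existing
   "2 * activeRefresh" already pays for.  A small check, with throwaway
   illustrative values and names of my own:

     # Rounding addHoldDownTime up to the next activeRefresh boundary never
     # exceeds addHoldDownTime + activeRefresh, so a separate
     # "addHoldDownTime % activeRefresh" term buys nothing extra.
     import math

     def next_poll_at_or_after(hold_down, active_refresh):
         return math.ceil(hold_down / active_refresh) * active_refresh

     for hold_down in (2591999, 2592000, 2592001):    # around 30 days
         for active_refresh in (3600, 43200, 86400):  # 1h, 12h, 1d
             assert next_poll_at_or_after(hold_down, active_refresh) \
                    <= hold_down + active_refresh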


9 WONTDO In 6.2.2 - same changes as for 6.2.1 and 6.2.1.1 (e.g. get rid of
==========================================================================

   activeRefreshOffset throughout).


   ,----
   | lastSigExpirationTime
   | v
   | |<- activeRefresh ->|<--- addHoldDownTime --->|<- activeRefresh ->|<- safetyMargin ->|
   |                                               ^                   ^                  ^
   |                                acceptanceStarts   acceptance begins       earliestSafe
   |                                                   to complete
   `----

   After the second activeRefresh interval all of the well-behaved and
   well-connected resolvers should have the new trust anchor.  The
   safetyMargin adds some space for poorly behaving or intermittently
   connected resolvers, or those with some drops in queries.
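
   In wall-clock terms, the picture above works out to something like the
   following sketch (names and values are mine and purely illustrative):

     # Wall-clock version of the timeline above, measured from
     # lastSigExpirationTime.  Illustrative values only.
     from datetime import datetime, timedelta, timezone

     last_sig_expiration_time = datetime(2018, 1, 1, tzinfo=timezone.utc)
     active_refresh = timedelta(hours=12)
     add_hold_down  = timedelta(days=30)
     safety_margin  = timedelta(days=2)

     # After activeRefresh + addHoldDownTime + activeRefresh, the well-behaved
     # and well-connected resolvers should have the new trust anchor.
     all_well_behaved_done = (last_sig_expiration_time
                              + active_refresh + add_hold_down + active_refresh)
     earliest_safe = all_well_behaved_done + safety_margin
     print(earliest_safe.isoformat())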


10 DONE Section 6.3 has one too many activeRefresh terms in both formulas -
===========================================================================

   here are the corrected ones:

   remWaitTime = sigExpirationTimeRemaining + activeRefresh + safetyFactor

   remWallClockTime = lastSigExpirationTime + activeRefresh + safetyFactor

   Basically, assuming no attacker, and no drops, all well-behaved
   resolvers will see the revocation after one activeRefresh interval
   from the time of publication.  Add the safety factor to take care of
   the slackers.  This is a fine value for normal revocations where
   you're pretty sure that the key hasn't been compromised.

   There is no hold-down timer for revocation - revocations take effect
   immediately upon receipt and validation.
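
   Spelled out with throwaway numbers (purely illustrative), the corrected
   formulas are just:

     # Corrected revocation formulas from above.  Values are illustrative only.
     from datetime import datetime, timedelta, timezone

     active_refresh = 12 * 3600                     # seconds
     safety_factor  = 24 * 3600                     # seconds
     sig_expiration_time_remaining = 5 * 24 * 3600  # seconds left on the old RRSIGs

     rem_wait_time = sig_expiration_time_remaining + active_refresh + safety_factor

     last_sig_expiration_time = datetime(2018, 1, 1, tzinfo=timezone.utc)
     rem_wall_clock_time = (last_sig_expiration_time
                            + timedelta(seconds=active_refresh + safety_factor))

     print(rem_wait_time / 86400.0, "days;", rem_wall_clock_time.isoformat())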


11 WONTDO Guidance about key compromise
=======================================

   In the case of a key compromise, I would suggest that the revoked key
   be published for the same interval as you would use for adding a new
   trust anchor.  (But of course, this won't actually matter all that
   much if you only have a single trust anchor....)

   + Good advice to put in a general advice document in the future

Yes and no.  How you do revocation depends on why you are doing the revocation and what the downside is if you don't get it right.  In the one-in/one-out scheme that the root uses, you can use the shorter publication period.  If you're dealing with a compromised key, you need different guidance.  Both of these are guidance for the publisher based on different input conditions, and ignoring one in favor of the other makes little sense.  The math I gave you is correct for the input conditions you specified, but they aren't the only possible input conditions.  This document started because you believed that publishers weren't making the right assumptions about publication intervals and needed some guidance to explain what they needed to think about.  This seems to be the same class of misguidance: if a publisher uses this guidance for a compromised key, they might (probably will) be making a mistake.

Later, Mike



12 DONE Appendix A - fix the calculations to match up with the section 6 formulas.
==================================================================================


_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop
