Re: [DNSOP] I-D Action: draft-ietf-dnsop-rfc5011-security-considerations-08.txt

Michael StJohns Mon, 11 Dec 2017 18:15:00 -0800

On 12/11/2017 8:03 PM, Wes Hardaker wrote:

Michael StJohns <m...@nthpermutation.com> writes:


Hi Mike,

Thanks for explaining your thinking because I think, after reading it:
we're actually in agreement but using different terms for where to put
in the slop you're worried about.

Specifically:

A perfectly operating resolver with perfect clock and perfect
connectivity and no outages MIGHT possibly keep a perfect interval
between each query it makes (making your activeRefreshOffset
meaningful), but 10000 resolvers ALL keeping perfect intervals?

Yes, I agree.  But, this is why I want the majority of the equation to
be defining the mathematical perfect certainty.  And then *after* that,
add the operational slop factor (safetyMargin) to account for both
problems and reality (you forgot to add "speed of light issues" in your
text above, for example ).

(sigh - safety factor deals with speed of light issues....DUH)


No, no, no, no, no.

Thus, I break the equation into two critical parts:

addWallClockTime = lastSigExpirationTime
                    + addHoldDownTime
                    + activeRefresh                   ^
                    + activeRefreshOffset             |
                                                      |
Precise Math                                         |
-----------------------------------------------------|
Needed Fuzz                                          |
                                                      |
                    + safetyMargin                    |
                                                      v
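
For anyone who wants to plug numbers into the sum above, here is a
minimal Python sketch. The function name, the parameter names, and the
choice of seconds as the unit are assumptions made for illustration.
The terms are the ones in the equation, and activeRefreshOffset is
passed in as given rather than derived.

```python
DAY = 86400  # seconds; the unit is an assumption for this sketch


def add_wall_clock_time(last_sig_expiration_time, add_hold_down_time,
                        active_refresh, active_refresh_offset,
                        safety_margin):
    """Sum the terms of the equation as it is split in the diagram."""
    # "Precise Math" part of the diagram.
    precise = (last_sig_expiration_time
               + add_hold_down_time
               + active_refresh
               + active_refresh_offset)
    # "Needed Fuzz" part of the diagram.
    return precise + safety_margin
```

With lastSigExpirationTime taken as 0, a 30 day addHoldDownTime, a 1 day
activeRefresh, and the other two terms set to 0, this returns 31 days.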

If you'd reorder this properly, you can probably get the right answer.
For the first part of the discussion, assume no drift or retransmits
between (1) and (2) and between (3) and (4).

1) T == lastSigExpirationTime, and microscopically after this is the
   time that the server sees the first query from any resolver starting
   their trust anchor installation.
2) T + activeRefresh is the time at which the server sees the last
   query from the last resolver just starting their trust anchor
   installation.
3) T + activeRefresh + addHoldDownTime is the time at which the server
   sees the first query from any resolver finalizing its trust anchor
   installation.
4) T + activeRefresh + addHoldDownTime + activeRefresh is the time at
   which the server sees the last query from the last resolver
   finalizing its trust anchor installation.

(1) is the earliest time any resolver can start its installation
(assuming an attack) because it's also the time when all of the old
signatures expire.

(2) is the time at which a resolver that had its last activeRefresh just
before T (and because of that wasn't able to start its installation)
will send its first installation query.

Between (2) and (3) any given resolver may drift/retransmit, with the
result that any given resolver may end up making a query just before
(3), placing its next and final query at (3) plus activeRefresh.

Finally, to deal with drift and retransmits between (1) and (2), and
between (3) and (4), we add a safetyFactor. That deals with about
99.9999% of drift and retransmits but will never deal with the servers
that have been offline or otherwise unable to get their queries
completed. The retransmits in a given client's addHoldDown period only
really move the end point for a given resolver and don't affect the
overall safetyFactor of the set of resolvers.
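
The four points and the safetyFactor can be written out as a short
Python sketch. All names here (ServerTimeline, server_timeline,
earliest_safe_time) and the use of seconds are hypothetical and chosen
only for illustration. The arithmetic is just the sums in (1) through
(4) plus the safetyFactor.

```python
from typing import NamedTuple

DAY = 86400  # seconds; the unit is an assumption for this sketch


class ServerTimeline(NamedTuple):
    first_start: int     # (1) first query from any resolver starting
    last_start: int      # (2) last query from the last resolver starting
    first_finalize: int  # (3) first query from any resolver finalizing
    last_finalize: int   # (4) last query from the last resolver finalizing


def server_timeline(last_sig_expiration_time, active_refresh,
                    add_hold_down_time):
    """Points (1)-(4) as seen by the server, with no drift or retransmits."""
    t = last_sig_expiration_time
    return ServerTimeline(
        first_start=t,
        last_start=t + active_refresh,
        first_finalize=t + active_refresh + add_hold_down_time,
        last_finalize=t + active_refresh + add_hold_down_time + active_refresh,
    )


def earliest_safe_time(last_sig_expiration_time, active_refresh,
                       add_hold_down_time, safety_factor):
    """Point (4) plus the safetyFactor for drift and retransmits."""
    timeline = server_timeline(last_sig_expiration_time, active_refresh,
                               add_hold_down_time)
    return timeline.last_finalize + safety_factor
```

For example, with T taken as 0, a 1 day activeRefresh, and a 30 day
addHoldDownTime, point (4) lands at 32 days before any safetyFactor is
added.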


IE, if a perfect resolver hitting a RFC5011 zone with an activeRefresh
that evenly divides into 30 days:

   1) queries at T--- = lastSigExpirationTime - .000001
   2) queries at T+1--- = lastSigExpirationTime - .000001 + activeRefresh

Yes.

   3) Notes that it just saw a new key (assuming worst case #1 is replayed)
   4) starts timer
   5) will query again at lastSigExpirationTime + 30 days - .000001

No - from the server's point of view, the worst client (which is the
only one the server cares about) will make its last query before trust
anchor installation at lastSigExpirationTime + activeRefresh (when the
last CLIENT saw its first valid update) + 30 days - .0000001.

   6) notes this is still in waiting period
   7) will query again at lastSigExpirationTime + 30 days - .000001
      + activeRefresh

Nope. The worst client will query again at (from the server's point of
view) lastSigExpiration + activeRefresh + addHoldDown (30) +
activeRefresh.

From a given client's point of view, the last query can happen anywhere
from (lastSigExpiration + addHoldDown + .00000001) to
(lastSigExpiration + activeRefresh + addHoldDown + activeRefresh). The
server only cares about the worst (latest) case.

   8) now notes that it's been 30 days and accepts key

There is only 1 activeRefresh in that sequence.  And that's what's in
the equation.  Because the timing distance between #7 and #2 is still 30
days when queried to the perfect sub-nano second.


Nope.  Not from the server's point of view.


Then there should be a bunch of delays inserted, network timeouts, etc.
That's where the safetyMargin should come in and catch all the issues
with the impreciseness of the real world.  Now, if you want to add an
activeRefresh to the already defined safetyMargin suggested term, I'm
willing to consider that.  But it shouldn't be listed as part of
anything but the slop term for security analysis clarity.

Would you like to add more time to the safetyMargin to deal with the
non-perfect world, including clock drift because of time delays in a
bunch of queries back to back or any other reason?


Ending note about the precise timeline: when 30 days isn't divisible by
the activeRefresh, then you need to add the other term we haven't talked
about much which is the activeRefreshOffset which accounts for this
case.

And again. NO. The retransmits over a given set of clients in the
addHoldDown period will result in at least one client (the "worst"
client) ending up making a query just before the expiration of ITS
addHoldDown timer. Assuming the worst case of at least one client
making a query just before the lastSigExpirationTime and that same
client drifting/retransmitting enough to make a query just before its
addHoldDown time, the activeRefreshOffset is a useless value to
calculate.
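
The drift argument can be illustrated with a toy Python model of a
single client. Everything in it is an assumption made for illustration:
the function name acceptance_time, the extra_delays parameter, times in
seconds, and the simplification that a client accepts the key on its
first query at or after its addHoldDown timer expires. It is a sketch
of the worst case described above, not a model of any real resolver.

```python
DAY = 86400  # seconds; the unit is an assumption for this sketch


def acceptance_time(first_seen, add_hold_down_time, active_refresh,
                    extra_delays=()):
    """Time of the first query at or after the hold-down timer expires.

    extra_delays[i] is the drift/retransmit delay added to the i-th
    query interval after first_seen; missing entries count as 0.
    """
    expiry = first_seen + add_hold_down_time
    t = first_seen
    i = 0
    while t < expiry:
        delay = extra_delays[i] if i < len(extra_delays) else 0
        t += active_refresh + delay
        i += 1
    return t


# A perfect client with a 1 day activeRefresh queries exactly at expiry.
perfect = acceptance_time(0, 30 * DAY, DAY)

# One interval delayed by a day minus a second puts a later query one
# second before expiry, so the accepting query comes almost a full
# activeRefresh after the hold-down period ends.
drifted = acceptance_time(0, 30 * DAY, DAY, extra_delays=(DAY - 1,))
```

In this model the perfect client accepts at first_seen + 30 days, while
the drifted client accepts at first_seen + 30 days + activeRefresh minus
one second, which is the case the server has to plan for.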


Later, Mike


Cheers,
Wes



_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop
