On 12/7/2017 7:53 PM, Wes Hardaker wrote:
Michael StJohns <m...@nthpermutation.com> writes:
Much improved - but still some disconnects (all review is de novo):
Thanks Mike. All good comments. I've attached responses and actions
(or inactions) below and will push a new version shortly as well.
Wes Hardaker
4 TODO in 3 - sigExpirationTimeRemaining - two items:
=====================================================
"latestSigExpirationTime" -> lastSigExpirationTime. And measured from
when? I think it's "when the addWaitTime calculation is run" or
"lastSigExpirationTime - now"
+ Response: I think it's fairly self evident from the text that it's
"when used", which is indeed at least during addWaitTime
calculation. But it's more conceptual and 'the amount of time
remaining' I think is pretty clear to mean "from now". I thought
about adding something like "from now" into the sentence, but that
didn't seem better to me and added unneeded complexity to the
sentence. Suggestions?
6 WONTDO Delete 6.1.4 - activeRefreshOffset - it's a nonsensical value that is
==============================================================================
only valid from the resolver's point of view. For a given
publisher/authoritative server - there will be as many
activeRefreshOffsets as there are resolvers so the publisher must
assume the worst case of activeRefresh.
+ Response: I disagree here. This was put in after a discussion with
Matthijs Mekking to specifically address the case where the polling
period may not end up right in time-step with the addHoldDownTime.
The end result is that an RFC5011 resolver may be polling at a
different frequency based on the activeRefresh time. We are
defining a minimum, mathematically defined safety value, and this
value is calculable, so it should remain. There is a hint of
guidance text that says you can set it to activeRefresh for
simplicity if you want. But the math-line is before a full second
activeRefresh. But the value itself needs to be included to account
for the odd frequency slippage.
TL;DR: this isn't a guidance document, it's a mathematical line
document
The problem is that the polling period is a resolver concept, not a
publisher concept. And there are as many different polling intervals
(all consisting of different start times but with lengths approximately
equal to the activeRefresh interval) as there are resolvers.
Look. Publisher publishes at time A -.000000001 (or any time before
this) where A is the last expiration time of the old records. Resolver
First does an active query at A+.00000001, Resolver Last does a query at
A + activeRefresh interval - .000000001. All other well behaved
resolvers will have made a query at or after the first resolver and at
or before the last resolver (with a safety slop added in for lost
packets and retransmits done under the fast query rate). Resolver
Last will then (along with all the other resolvers) wait at least
addHoldDownTime and then make one last query. The relationship to when
that last query is made to the resolver's perception of when the
addHoldDownTime expired can be different for each resolver, with the
sole requirement that the resolver waited for the next query after the
addHoldDownTime expired.
A perfectly operating resolver with perfect clock and perfect
connectivity and no outages MIGHT possibly keep a perfect interval
between each query it makes (making your activeRefreshOffset
meaningful), but 10000 resolvers ALL keeping perfect intervals? NWIH.
So for the purposes of this we assume that ALL resolvers' addHoldDownTime
expired just after their last query, meaning that we assume that ALL
resolvers waited at least another queryInterval after the addHoldDownTime
before making the query that triggers trust anchor installation. What
the calculations represent is the worst case for any given resolver, but
actually the overall best case numbers for the publisher. The
safetyFactor deals with the clock and query drifts from the large
population of resolvers and moves the publisher slightly off their best
case estimate.
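
The worst-case argument above can be checked with a small simulation.
This is only a sketch under assumptions that are not in the thread:
times are seconds relative to A, each resolver's polling interval
drifts between 90% and 100% of activeRefresh, no packets are lost (so
no safetyFactor is needed), and the names install_time and simulate
are made up for illustration.

```python
import random

def install_time(first_query, active_refresh, add_hold_down_time, rng):
    """Seconds after A (last old-RRSIG expiration) at which one simulated
    resolver installs the new trust anchor."""
    t = first_query                      # first query after A: new key first seen
    hold_down_expires = t + add_hold_down_time
    # Keep polling until the first query made after the hold-down timer expires.
    # Assumption: intervals drift, but never exceed active_refresh.
    while t <= hold_down_expires:
        t += rng.uniform(0.9, 1.0) * active_refresh
    return t

def simulate(n_resolvers, active_refresh, add_hold_down_time, seed=0):
    """Install times for n_resolvers whose first query after A is spread
    uniformly across one activeRefresh interval."""
    rng = random.Random(seed)
    return [
        install_time(rng.uniform(0, active_refresh),
                     active_refresh, add_hold_down_time, rng)
        for _ in range(n_resolvers)
    ]

DAY = 86400
ACTIVE_REFRESH = 1 * DAY          # illustrative value
ADD_HOLD_DOWN_TIME = 30 * DAY     # illustrative value

times = simulate(10000, ACTIVE_REFRESH, ADD_HOLD_DOWN_TIME)
worst_case = ACTIVE_REFRESH + ADD_HOLD_DOWN_TIME + ACTIVE_REFRESH
print(max(times) <= worst_case)   # prints True
```

With drifting intervals no single offset describes the population, but
every simulated resolver still installs within activeRefresh +
addHoldDownTime + activeRefresh of A.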
So the total interval from after the last RRSig expiration of the old
stuff is activeRefresh (the latest query from all well behaved
resolvers after the last expiration) + addHoldDownTime (.0000001 after
this is when the first resolver installs the trust anchor) +
activeRefresh (when the well behaved and connected resolvers have
installed the new trust anchors) + safetyFactor (when the guys who
dropped a few packets during the process installed the trust anchors).
You got it wrong. It's not a matter of disagreement about whether this
is guidance vs math, it's about choosing the parameters in a way that's
meaningful from the publisher's point of view. You picked data that is
only meaningful in the context of a single resolver, but which reduces
to activeRefresh for a large population of resolvers.
Don't make it more complex than it needs to be.
7 WONTDO 6.2.1 - replace activeRefreshOffset with activeRefresh - worst case
============================================================================
value. (Wait Timer Based Calculation)
8 WONTDO fix 6.2.1.1 delete the term for addHoldDownTime % activeRefresh - the
==============================================================================
"2 * activeRefresh" in the previous term covers both the activeRefresh
interval at the beginning of the acceptance period and the
activeRefresh interval at the end.
9 WONTDO In 6.2.2 - same changes as for 6.2.1 and 6.2.1.1 (e.g. get rid of
==========================================================================
activeRefreshOffset throughout).
,----
|                       v activeRefresh v addHoldDownTime v activeRefresh v safetyMargin v
| ----------------------|---------------|-----------------|---------------|--------------|
|  lastSigExpirationTime^                 acceptanceStarts^               ^  earliestSafe^
|                                            acceptance begins to complete^
`----
After the second activeRefresh interval all of the well behaved and
well connected resolvers should have the new trust anchor. The
safetyMargin adds some space for poorly behaving or intermittently
connected resolvers or those with some drops in queries.
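
The milestones in this timeline can be computed directly. The sketch
below just adds the intervals in the order described in this message;
the function name add_timeline, the dictionary keys, and the example
durations are illustrative assumptions, not values from the draft.

```python
from datetime import datetime, timedelta, timezone

def add_timeline(last_sig_expiration, active_refresh,
                 add_hold_down_time, safety_margin):
    """Milestones for adding a trust anchor, measured from the expiration
    of the last RRSIG over the old DNSKEY set."""
    acceptance_starts = last_sig_expiration + active_refresh + add_hold_down_time
    acceptance_completing = acceptance_starts + active_refresh
    earliest_safe = acceptance_completing + safety_margin
    return {
        "acceptanceStarts": acceptance_starts,
        "acceptanceCompleting": acceptance_completing,
        "earliestSafe": earliest_safe,
    }

# Illustrative values only.
timeline = add_timeline(
    last_sig_expiration=datetime(2018, 1, 1, tzinfo=timezone.utc),
    active_refresh=timedelta(days=1),
    add_hold_down_time=timedelta(days=30),
    safety_margin=timedelta(days=2),
)
for name, when in timeline.items():
    print(name, when.isoformat())
```

With these inputs earliestSafe lands 34 days after the last signature
expiration: 1 + 30 + 1 + 2.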
10 DONE Section 6.3 has one too many activeRefresh terms in both formulas -
===========================================================================
here are the corrected ones:
remWaitTime = sigExpirationTimeRemaining + activeRefresh +
safetyFactor
remWallClockTime = lastSigExpirationTime + activeRefresh +
safetyFactor
Basically, assuming no attacker, and no drops, all well-behaved
resolvers will see the revocation after one activeRefresh interval
from the time of publication. Add the safety factor to take care of
the slackers. This is a fine value for normal revocations where
you're pretty sure that the key hasn't been compromised.
There is no hold-time timer for revocation - they take effect
immediately upon receipt and validation.
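
As a rough check that the two corrected formulas agree, here is a
minimal sketch. It assumes sigExpirationTimeRemaining means
"lastSigExpirationTime - now", as suggested earlier in this thread;
the function names and the example durations are illustrative only.

```python
from datetime import datetime, timedelta, timezone

def rem_wait_time(sig_expiration_time_remaining, active_refresh, safety_factor):
    """How long from now the revoked key must stay published."""
    return sig_expiration_time_remaining + active_refresh + safety_factor

def rem_wall_clock_time(last_sig_expiration_time, active_refresh, safety_factor):
    """Wall-clock time until which the revoked key must stay published."""
    return last_sig_expiration_time + active_refresh + safety_factor

# Illustrative values only.
now = datetime(2018, 1, 1, tzinfo=timezone.utc)
last_sig_expiration_time = datetime(2018, 1, 11, tzinfo=timezone.utc)
active_refresh = timedelta(days=1)
safety_factor = timedelta(days=2)

# Assumption: "remaining" is measured from now.
wait = rem_wait_time(last_sig_expiration_time - now, active_refresh, safety_factor)
wall = rem_wall_clock_time(last_sig_expiration_time, active_refresh, safety_factor)
print(wait, wall.isoformat())
```

Under that assumption, now + remWaitTime and remWallClockTime name the
same instant, and neither formula contains an addHoldDownTime term.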
11 WONTDO Guidance about key compromise
=======================================
In the case of a key compromise, I would suggest that the revoked key
be published for the same interval as you would use for adding a new
trust anchor. (But of course, this won't actually matter all that
much if you only have a single trust anchor....)
+ Good advice to put in a general advice document in the future
Yes and no. How you do revocation depends on why you are doing
revocation and what the downside of the revocation is if you don't get
it right. In the one in/one out scheme that the root uses you can use
the shorter publication period. If you're dealing with a compromised key
you need different guidance. Both of these are guidance for the
publisher based on different input conditions. Ignoring one over the
other makes little sense. The math I gave you is correct for the input
conditions you specified. But they aren't the only possible input
conditions. This document started because you believed that the
publishers weren't making the right assumptions about publication
intervals and needed some guidance to explain what they needed to think
about. This seems to be the same class of misguidance. If a publisher
uses this guidance for a compromised key they might (probably will) be
making a mistake.
Later, Mike
12 DONE Appendix A - fix the calculations to match up with the section 6 formulas.
==================================================================================
_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop