Re: [DNSOP] I-D Action: draft-ietf-dnsop-rfc5011-security-considerations-08.txt

2017-12-07 Thread Wes Hardaker
Michael StJohns  writes:

> Much improved - but still some disconnects (all review is de novo):

That's Mike.  All good comments.  I've attached responses and actions
(or inactions) below and will push a new version shortly as well.

Wes Hardaker


Table of Contents
_

1 DONE In Abstract - insert "by the publisher" after "must be followed" -
2 DONE Section 2 - first para - "from the DNSKEY publication and
3 DONE in 3 - lastSigExpirationTime - replace the first sentence  with "The
4 TODO in 3 - sigExpirationTimeRemaining - two items:
5 DONE in 5.1.1 T+10 - replace "they have now expired" with "the signatures
6 WONTDO Delete 6.1.4 - activeRefreshOffset -  it's a nonsensical value that is
7 WONTDO 6.2.1 - replace activeRefreshOffset with activeRefresh - worst case
8 WONTDO fix 6.2.1.1 delete the term for addHoldDownTime % activeRefresh - the
9 WONTDO In 6.2.2 - same changes as for 6.2.1 and 6.2.1.1 (e.g. get rid of
10 DONE Section 6.3 has one too many activeRefresh terms in both formulas -
11 WONTDO Guidance about key compromise
12 DONE Appendix A - fix the calculations to match up with the section 6 
formulas.


1 DONE In Abstract - insert "by the publisher" after "must be followed" -
=

  this is clear later, but should be clear in the abstract.


2 DONE Section 2 - first para - "from the DNSKEY publication and


  revocation's point of view" is unusual phrasing.  I'm not sure how a
  publication or revocation has a point of view.  I think you meant from
  the trust anchor publisher or SEP DNSKEY publisher's point of view?

  + Response: I did indeed mean SEP publisher; fixed


3 DONE in 3 - lastSigExpirationTime - replace the first sentence  with "The
===

  latest value (i.e. the future most date and time) of any RRSig
  Signature Expiration field  covering any DNSKEY RRSet containing only
  the old trust anchor(s) that are being superseded." - This may still
  need wordsmithing or expansion.


4 TODO in 3 - sigExpirationTimeRemaining - two items:
=

  "latestSigExpirationTime" -> lastSigExpirationTime.  and measured from
  when?   I think it's "when the addWaitTime calculation is run" or
  "lastSigExpirationTime - now"

  + Response: I think it's fairly self-evident from the text that it's
    "when used", which is indeed at least during the addWaitTime
    calculation.  But it's more conceptual, and 'the amount of time
    remaining' I think is pretty clear to mean "from now".  I thought
    about adding something like "from now" into the sentence, but that
    didn't seem better to me and added unneeded complexity to the
    sentence.  Suggestions?
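
A minimal Python sketch of the "lastSigExpirationTime - now" reading discussed in this item.  The function name, the explicit `now` argument, and the clamp to zero once the signatures have expired are assumptions made for illustration; none of them is text from the draft.

```python
from datetime import datetime, timedelta, timezone


def sig_expiration_time_remaining(last_sig_expiration_time, now=None):
    """Time left until the last RRSIG over the old DNSKEY RRSet expires."""
    if now is None:
        # Assumption: "now" is the moment the addWaitTime calculation runs.
        now = datetime.now(timezone.utc)
    # Assumption: once the signatures have expired, nothing remains.
    return max(last_sig_expiration_time - now, timedelta(0))


# Illustrative dates only.
now = datetime(2017, 12, 7, tzinfo=timezone.utc)
last_expiry = datetime(2017, 12, 14, tzinfo=timezone.utc)
print(sig_expiration_time_remaining(last_expiry, now))
```

Passing `now` explicitly makes the "measured from when?" question visible in the call itself, which is one way to sidestep the wording problem.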


5 DONE in 5.1.1 T+10 - replace "they have now expired" with "the signatures
===

  have now expired" -  clarify context.


6 WONTDO Delete 6.1.4 - activeRefreshOffset -  it's a nonsensical value that is
==

  only valid from the resolver's point of view.   For a given
  publisher/authoritative server - there will be as many
  activeRefreshOffsets as there are resolvers so the publisher must
  assume the worst case of activeRefresh.

  + Response: I disagree here.  This was put in after a discussion with
    Matthijs Mekking to specifically address the case where the polling
    period may not end up right in time-step with the addHoldDownTime.
    The end result is that an RFC5011 resolver is polling at a
    different frequency based on the activeRefresh time.  We are
    defining a minimum, mathematically defined safety value, and this
    value is calculable, so it should remain.  There is a hint of
    guidance text that says you can set it to activeRefresh for
    simplicity if you want.  But the math-line is before a full second
    activeRefresh.  But the value itself needs to be included to account
    for the odd frequency slippage.

    TL;DR: this isn't a guidance document, it's a mathematical line
    document
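
A small Python sketch of the slippage being argued about, assuming the offset is the remainder "addHoldDownTime % activeRefresh" (the term named in comment 8 below).  The function name and the hour values are illustrative assumptions, not values from the draft.

```python
def active_refresh_offset(add_hold_down_time, active_refresh):
    """Remainder left when addHoldDownTime is not a whole number of
    activeRefresh polling periods (same time unit for both arguments)."""
    return add_hold_down_time % active_refresh


ADD_HOLD_DOWN_HOURS = 30 * 24  # a 30-day hold-down, in hours (illustrative)

for refresh_hours in (12, 11):
    offset = active_refresh_offset(ADD_HOLD_DOWN_HOURS, refresh_hours)
    print(f"activeRefresh={refresh_hours}h -> offset={offset}h")
```

When the polling period divides the hold-down time evenly the offset is zero; otherwise it is some value strictly below one activeRefresh, which is why setting it to a full activeRefresh is the simpler but more conservative choice.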


7 WONTDO 6.2.1 - replace activeRefreshOffset with activeRefresh - worst case


  value. (Wait Timer Based Calculation)


8 WONTDO fix 6.2.1.1 delete the term for addHoldDownTime % activeRefresh - the
==

  "2 * activeRefresh" in the previous term covers both the activeRefresh
  interval at the beginning of the acceptance period and the
  activeRefresh interval at the end.


9 WONTDO In 6.2.2 - same changes as for 6.2.1 and 6.2.1.1 (e.g. get rid of
==

  activeRefreshOffset throughout).



Re: [DNSOP] Comments on mic comments, 5011 update's authorship

2017-12-07 Thread Wes Hardaker
Edward Lewis  writes:

Ed,

Sorry for the delay in a response.  Too many recent deadlines and
vacations...

> It seems that there is an impression that I feel the authors of the
> 5011-update draft are the wrong choice to be documenting this.  This is
> not meant to be a personal attack on the authors but a blanket comment
> on seeing operator-focused documents being produced without operator
> involvement.  (Apologies if it is thought to be an ad hominem
> "attack".)

I do understand that it wasn't anything personal.

> It isn't that Wes and Warren aren't qualified to write the document.
> I'm commenting on the legacy of documents written by protocol
> designers that are passed off as operations guidance.

I think this is where the biggest misconception may lie about the
purpose of our document.  The document is structured as a mathematically
defined security line that you MUST NOT cross, not as operational
guidance.  We even state so multiple times in the document, and I do hope
that a future document (authored by someone else) comes out as a BCP or
informational document that truly does give good advice, from a
publisher's point of view, about the best way to use RFC5011 and
suggested timing mechanisms for key-rolling things like the root and
other domains.

This document is a security analysis result, however, so you may
think this was actually the wrong group to submit it through?

[good story about operators not reading RFCs...]

> Since then I wondered what could be done to improve the usefulness of
> RFCs to operators and why I have begun to think of "return on
> investment" of documents.

I sure wish we had a better answer to this problem, as it's been
plaguing the O&M section of the IETF for decades (forever?).
Unfortunately, I suspect that there isn't nearly enough "real operator
content" here (IETF) to attract the attention of operators.  It still
looks and feels and smells like a protocol engineer camp, and if you're
an operator and have the choice of spending time and a travel budget
toward an IETF or toward a *OG/RIPE, it's much more likely you'll head
toward the dedicated operational camps.  I'm not sure that means we
shouldn't produce work out of the O&M area though, as we have a lot of
people from operator companies that are here as proxies at least.
-- 
Wes Hardaker
USC/ISI

___
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop


[DNSOP] I-D Action: draft-ietf-dnsop-rfc5011-security-considerations-09.txt

2017-12-07 Thread internet-drafts

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the Domain Name System Operations WG of the IETF.

Title    : Security Considerations for RFC5011 Publishers
Authors  : Wes Hardaker
           Warren Kumari
Filename : draft-ietf-dnsop-rfc5011-security-considerations-09.txt
Pages    : 19
Date     : 2017-12-07

Abstract:
   This document extends the RFC5011 rollover strategy with timing
   advice that must be followed by the publisher in order to maintain
   security.  Specifically, this document describes the math behind the
   minimum time-length that a DNS zone publisher must wait before
   signing exclusively with recently added DNSKEYs.  This document also
   describes the minimum time-length that a DNS zone publisher must wait
   after publishing a revoked DNSKEY before assuming that all active
   RFC5011 resolvers should have seen the revocation-marked key and
   removed it from their list of trust anchors.

   This document contains much math and complicated equations, but the
   summary is that the key rollover / revocation time is much longer
   than intuition would suggest.  If you are not both publishing a
   DNSSEC DNSKEY, and using RFC5011 to advertise this DNSKEY as a new
   Secure Entry Point key for use as a trust anchor, you probably don't
   need to read this document.


The IETF datatracker status page for this draft is:
https://datatracker.ietf.org/doc/draft-ietf-dnsop-rfc5011-security-considerations/

There are also htmlized versions available at:
https://tools.ietf.org/html/draft-ietf-dnsop-rfc5011-security-considerations-09
https://datatracker.ietf.org/doc/html/draft-ietf-dnsop-rfc5011-security-considerations-09

A diff from the previous version is available at:
https://www.ietf.org/rfcdiff?url2=draft-ietf-dnsop-rfc5011-security-considerations-09


Please note that it may take a couple of minutes from the time of submission
until the htmlized version and diff are available at tools.ietf.org.

Internet-Drafts are also available by anonymous FTP at:
ftp://ftp.ietf.org/internet-drafts/



Re: [DNSOP] I-D Action: draft-ietf-dnsop-rfc5011-security-considerations-08.txt

2017-12-07 Thread Michael StJohns

On 12/7/2017 7:53 PM, Wes Hardaker wrote:
> Michael StJohns  writes:
>
>> Much improved - but still some disconnects (all review is de novo):
>
> That's Mike.  All good comments.  I've attached responses and actions
> (or inactions) below and will push a new version shortly as well.
>
> Wes Hardaker
>
> 4 TODO in 3 - sigExpirationTimeRemaining - two items:
>
>    "latestSigExpirationTime" -> lastSigExpirationTime.  and measured from
>    when?   I think it's "when the addWaitTime calculation is run" or
>    "lastSigExpirationTime - now"
>
>    + Response: I think it's fairly self-evident from the text that it's
>      "when used", which is indeed at least during the addWaitTime
>      calculation.  But it's more conceptual, and 'the amount of time
>      remaining' I think is pretty clear to mean "from now".  I thought
>      about adding something like "from now" into the sentence, but that
>      didn't seem better to me and added unneeded complexity to the
>      sentence.  Suggestions?
>
> 6 WONTDO Delete 6.1.4 - activeRefreshOffset -  it's a nonsensical value that is
>
>    only valid from the resolver's point of view.   For a given
>    publisher/authoritative server - there will be as many
>    activeRefreshOffsets as there are resolvers so the publisher must
>    assume the worst case of activeRefresh.
>
>    + Response: I disagree here.  This was put in after a discussion with
>      Matthijs Mekking to specifically address the case where the polling
>      period may not end up right in time-step with the addHoldDownTime.
>      The end result is that an RFC5011 resolver is polling at a
>      different frequency based on the activeRefresh time.  We are
>      defining a minimum, mathematically defined safety value, and this
>      value is calculable, so it should remain.  There is a hint of
>      guidance text that says you can set it to activeRefresh for
>      simplicity if you want.  But the math-line is before a full second
>      activeRefresh.  But the value itself needs to be included to account
>      for the odd frequency slippage.
>
>      TL;DR: this isn't a guidance document, it's a mathematical line
>      document



  The problem is that the polling period is a resolver concept, not a 
publisher concept.   And there are as many different polling intervals 
(all consisting of different start times but with lengths approximately 
equal to the activeRefresh interval) as there are resolvers.


Look.   Publisher publishes at time A -.1 (or any time before
this) where A is the last expiration time of the old records.  Resolver
First does an active query at A+.0001, Resolver Last does a query at
A + activeRefresh interval - .1.  All other well-behaved
resolvers will have made a query at or after the first resolver and at
or before the last resolver (with a safety slop added in for lost
packets and retransmits done under the fast query rate).    Resolver
Last will then (along with all the other resolvers) wait at least
addHoldDownTime and then make one last query.  The relationship between
when that last query is made and the resolver's perception of when the
addHoldDownTime expired can be different for each resolver, with the
sole requirement that the resolver waited for the next query after the
addHoldDownTime expired.


A perfectly operating resolver with perfect clock and perfect
connectivity and no outages MIGHT possibly keep a perfect interval
between each query it makes (making your activeRefreshOffset
meaningful), but 1 resolvers ALL keeping perfect intervals? NWIH.
So for the purposes of this we assume that ALL resolvers' addHoldDownTime
expired just after their last query, meaning that we assume that ALL
resolvers waited at least another queryInterval after the addHoldDownTime
before making the query that triggers trust anchor installation.


What the calculations represent is the worst case for any given resolver,
but actually the overall best case numbers for the publisher.   The
safetyFactor deals with the clock and query drifts from the large
population of resolvers and moves the publisher slightly off their best
case estimate.


So the total interval from after the last RRSig expiration of the old
stuff is  activeRefresh (the latest query from all well-behaved
resolvers after the last expiration) + addHoldDownTime (.001 after
this is when the first resolver installs the trust anchor) +
activeRefresh (when the well-behaved and connected resolvers have
installed the new trust anchors) + safetyFactor (when the guys who
dropped a few packets during the process installed the trust anchors).
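
The sum described in the paragraph above can be sketched in Python as follows.  The function name and the 24-hour safetyFactor are hypothetical, chosen only to make the arithmetic concrete; all values are in hours.

```python
def publisher_wait_after_last_expiration(active_refresh,
                                         add_hold_down_time,
                                         safety_factor):
    """Total wait, measured from the last RRSig expiration of the old
    DNSKEY RRSet, following the four terms listed in the message."""
    first_query_window = active_refresh    # every resolver has queried once
    hold_down = add_hold_down_time         # each resolver's hold-down runs
    trigger_query_window = active_refresh  # next query after hold-down ends
    return (first_query_window + hold_down
            + trigger_query_window + safety_factor)


# Illustrative values: 14 h activeRefresh, 30-day hold-down, and an
# assumed 24 h safetyFactor (the message does not give a number).
total = publisher_wait_after_last_expiration(14, 30 * 24, 24)
print(total, "hours")
```

Note that nothing in this sum depends on any single resolver's polling phase, which is the point being made about publisher-side parameters.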


You got it wrong.  It's not a matter of disagreement about whether this
is guidance vs math, it's about choosing the parameters in a way that's
meaningful from the publisher's point of view.  You picked data that is
only meaningful in the contex

Re: [DNSOP] I-D Action: draft-ietf-dnsop-rfc5011-security-considerations-08.txt

2017-12-07 Thread Michael StJohns
To try this out, let’s use a ttl of 28 hours and an expiration of 7 days to
get an active refresh as below.

Take an activeRefresh of 14 hours (giving a fast retry of 2.8 hours) and an
addHoldDownTime of 30 days (720 hours). That gives you an
activeRefreshOffset of 6 hours.   A perfect resolver will make 51 queries
in 714 hours and the last, triggering query at 728 hours.
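
For readers who want to reproduce these figures, here is a short Python sketch using the queryInterval and retryTime formulas from RFC 5011 section 2.3.  The function and variable names are illustrative, and every value is in hours.

```python
import math


def active_refresh(orig_ttl, sig_validity):
    """RFC 5011 section 2.3 query interval (hours)."""
    return max(1.0, min(15 * 24.0, orig_ttl / 2, sig_validity / 2))


def fast_retry(orig_ttl, sig_validity):
    """RFC 5011 section 2.3 retry time after a failed query (hours)."""
    return max(1.0, min(24.0, 0.1 * orig_ttl, 0.1 * sig_validity))


ttl, sig_validity, add_hold_down = 28.0, 7 * 24.0, 30 * 24.0

refresh = active_refresh(ttl, sig_validity)
retry = fast_retry(ttl, sig_validity)
offset = add_hold_down % refresh                 # activeRefreshOffset
queries_before_expiry = add_hold_down // refresh
# A perfectly periodic resolver's first query at or after hold-down expiry.
trigger = math.ceil(add_hold_down / refresh) * refresh

print(refresh, round(retry, 1), offset, queries_before_expiry, trigger)
```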

Assume a 4% failure rate (to make the math easy) in queries and assume the
first retry always works. That adds approximately two fast retry intervals
(2 * 2.8 = 5.6 hours) to the time, making the total time to do 51 queries
719.6 hours, for an actual offset of 0.4 hours.  The next and last query
needed to trigger the trust anchor installation will take place at 733.6
hours.
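
A rough Python simulation of this drift, under stated assumptions: exactly two of the first 51 queries fail (the positions 10 and 30 are arbitrary picks), each failure costs one fast retry interval, and the retry always succeeds.  The function name is hypothetical and times are in hours since the resolver first saw the new key.

```python
def trigger_query(add_hold_down, active_refresh, fast_retry, failed_queries):
    """Return (query number, elapsed hours) of the first successful query
    at or after the hold-down expiry."""
    elapsed = 0.0
    query = 0
    while True:
        query += 1
        elapsed += active_refresh
        if query in failed_queries:
            # Assumption: the first retry always succeeds, so a failed
            # query delays everything after it by one fast retry interval.
            elapsed += fast_retry
        if elapsed >= add_hold_down:
            return query, elapsed


lossless = trigger_query(720.0, 14.0, 2.8, failed_queries=set())
lossy = trigger_query(720.0, 14.0, 2.8, failed_queries={10, 30})
print(lossless, (lossy[0], round(lossy[1], 1)))
```

Moving or adding failures changes where the triggering query lands, so the remainder a publisher would compute from addHoldDownTime % activeRefresh stops matching what any particular resolver actually experiences.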

The point is that retransmission drift makes the whole concept of
activeRefreshOffset nonsensical in any population of resolvers with any
losses at all.

Mike