Re: [DNSOP] I-D Action: draft-ietf-dnsop-rfc5011-security-considerations-08.txt

Michael StJohns Sat, 16 Dec 2017 10:42:07 -0800

Hi Wes -

Interesting approach, but characterizing a single resolver isn't useful. You need to characterize the entire set of resolvers doing the query.

Also, you're still missing the fact that a given resolver can start its addHoldDown timer anywhere in the range of [0..activeRefresh] AFTER the signature expiration, depending on when it started its clock. All of your calculations are starting from the wrong point.

You're also continuing to miss the point that a given resolver makes its last query (assuming no retransmits) anywhere in the range of [0..activeRefresh] after the addHoldDown timer expires.

Your calculations represent the best case [e.g. triggers at 0 for both ends of the problem] when what you want is worst case.
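To make the best-case versus worst-case distinction concrete, here is a minimal Python sketch. It assumes the values used in the table in this message (sigExpirationTime = 604800, activeRefresh = 43200, and a 30 day addHoldDownTimer of 2592000 seconds) and ignores retransmits. The function and variable names are illustrative only and do not come from the draft.

```python
def acceptance_window(sig_expiration, active_refresh, add_hold_down):
    """Return (earliest, latest) time at which a resolver accepts the new key.

    Assumes no retransmits: each resolver first sees the key somewhere in
    [0, active_refresh] after the old signature expires, and makes its first
    query after the hold-down ends somewhere in [0, active_refresh] after that.
    """
    # Best case: both offsets are 0.
    earliest = sig_expiration + add_hold_down
    # Worst case: both offsets are a full active_refresh.
    latest = sig_expiration + active_refresh + add_hold_down + active_refresh
    return earliest, latest


best, worst = acceptance_window(604800, 43200, 2592000)
print(best, worst)  # 3196800 3283200
```

The gap between the two values is exactly 2 * activeRefresh, which is the span a single-resolver calculation starting at offset 0 does not cover.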


Below you've calculated activeRefresh as 43200

Timestamp       Human Time      Note
0       0d 0h 0m 0s     Resolver Queries
*New DNSKEY Publication*
604800  7d 0h 0m 0s     Resolver Queries
*sigExpirationTime = Original DNSKEY RRSIG Expires*
604800  7d 0h 0m 0s     DNSKEY First seen

648000 7d 12h 0m 0s Latest time DNSKEY first seen. (sigExpirationTime + activeRefresh)

3196800         37d 0h 0m 0s    Resolver Queries
*sigExpirationTime + addHoldDownTimer*

3196800         37d 0h 0m 0s    DNSKEY Accepted

*3240000 37d 12h - Add Hold down expired for last client: sigExpirationTime + activeRefresh + addHoldDownTimer (without retransmits)*

3240000         37d 12h 0m 0s   Resolver Queries
*sigExpirationTime + addHoldDownTimer + activeRefresh*
3240000 37d 12h 0m 0s *sigExpirationTime + addHoldDownTimer + activeRefresh + activeRefreshOffset*

*3283200 38d 0h 0m 0s DNSKey accepted by last client: sigExpirationTime + activeRefresh + addHoldDownTimer + activeRefresh (without retransmits)*

3326400         38d 12h 0m 0s   Resolver Queries
*sigExpirationTime + addHoldDownTimer + activeRefresh + driftSafety*
3326400 38d 12h 0m 0s *sigExpirationTime + addHoldDownTimer + activeRefresh + activeRefreshOffset + driftSafety*
3412800         39d 12h 0m 0s   Resolver Queries
*sigExpirationTime + addHoldDownTimer + activeRefresh + driftSafety + retrySafety*

For the rest - drift safety is probably no more than 10 seconds per query - call it about 620 seconds for this formula in the worst case (not assuming retransmits) - and could mostly be ignored; the worst case is going to be less than a fastQueryInterval in all but pathological cases. Basically 10s * (2 * activeRefresh + addHoldDown)/activeRefresh.
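As a quick check of the 620 second figure, the formula above can be evaluated directly. This is a small sketch; the function name and the default of 10 seconds per query are just the estimate stated above, and the 2592000 second addHoldDown is the 30 day value implied by the table.

```python
def drift_safety(active_refresh, add_hold_down, drift_per_query=10):
    """Worst-case accumulated drift in seconds, ignoring retransmits."""
    # Number of queries made across the (aR + aHD + aR) interval.
    queries = (2 * active_refresh + add_hold_down) / active_refresh
    return drift_per_query * queries


print(drift_safety(43200, 2592000))  # 620.0
```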

RetrySafety needs to be calculated on the set of clients as we're looking for the worst case of all of the clients (rightmost point of the normal distribution curve).

I'd suggest redoing this as a simulation. I used the two sets of params (30 days and 24 hours vs 30 days and 28 hours) (sig expire and ttl) plus .05 failure plus 25000 clients and ran multiple trials for each. The values in my table represent - for each trial - the latest time a client got to that point. For the last four columns, those values represent the number of times a client exceeded the calculated safe interval (and divide by the total number of clients to get a percentage...).
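For readers who want to experiment, here is a rough Python sketch of that kind of simulation. It is not the Java program referred to in this thread. It assumes the RFC 5011 section 2.3 formulas for the active refresh and retry intervals, a 30 day add hold-down, that each query fails independently with the given probability, that a failed query is retried after the retry interval, and that a key is accepted at the first successful query strictly after the hold-down ends. All function names are hypothetical.

```python
import random

HOUR, DAY = 3600, 86400


def active_refresh(ttl, sig_validity):
    # RFC 5011 sec. 2.3: MAX(1 hr, MIN(15 days, 1/2*OrigTTL, 1/2*RRSigExpirationInterval))
    return max(HOUR, min(15 * DAY, ttl // 2, sig_validity // 2))


def retry_time(ttl, sig_validity):
    # RFC 5011 sec. 2.3: MAX(1 hour, MIN(1 day, .1*origTTL, .1*expireInterval))
    return max(HOUR, min(DAY, ttl // 10, sig_validity // 10))


def next_success(t, retry, fail_prob, rng):
    """Time of the first successful query at or after t, retrying on failure."""
    while rng.random() < fail_prob:
        t += retry
    return t


def simulate_client(sig_exp, a_r, retry, hold_down, fail_prob, rng):
    """Return the time at which one client accepts the new key."""
    # Assumption: the client's query phase is uniform over one refresh interval.
    first_seen = next_success(sig_exp + rng.randrange(0, a_r + 1), retry, fail_prob, rng)
    hold_end = first_seen + hold_down
    t = first_seen
    while True:
        # Assumption: the next scheduled query is a_r after the last success.
        t = next_success(t + a_r, retry, fail_prob, rng)
        if t > hold_end:
            return t


def run_trial(n_clients, ttl, sig_validity, hold_down, fail_prob, seed=0, sig_exp=0):
    """Return (latest acceptance time, number of clients past aR + aHD + aR)."""
    rng = random.Random(seed)
    a_r = active_refresh(ttl, sig_validity)
    retry = retry_time(ttl, sig_validity)
    safe = sig_exp + a_r + hold_down + a_r
    times = [simulate_client(sig_exp, a_r, retry, hold_down, fail_prob, rng)
             for _ in range(n_clients)]
    return max(times), sum(t > safe for t in times)


latest, late = run_trial(25000, ttl=24 * HOUR, sig_validity=30 * DAY,
                         hold_down=30 * DAY, fail_prob=0.05)
print(latest, late, late / 25000)
```

Swapping in ttl=28 * HOUR gives the second parameter set. The count of late clients is measured against aR + aHD + aR with no safety factor, so it only indicates how much retry margin the whole population needs.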


Later, Mike


On 12/15/2017 6:55 PM, Wes Hardaker wrote:

Michael StJohns <m...@nthpermutation.com> writes:

Below is a java program I wrote to model this stuff.  In the table,
SF2 represents the number of clients that blew past twice the safety
factor (for aR+aHD+aR), SF1 represents the number of clients that blew
past the single safety factor.  OF is the number of clients using the
activeRefreshOffset calculation that finished after the calculated
interval (e.g. aR+aHD+aRO).  OF+s is the number of clients that
finished after the activeRefreshOffset + safetyFactor (in the first
table these are the same because of perfect responses).   In the
second table, compare SF1 to OF+s - SF1 < OF+s suggesting that
activeRefresh is a better choice than activeRefreshQuery for the third
term of the equation.  You can try a lot of different combinations,
but I haven't found any case where OF+s performs better than SF1.

The difference between lastStart and lAddHoldBegin represents the
retransmits after the first query.  The differences between
lAddHoldEnd and lFinalQuery represent retransmits after the last
normal query before the end of the add hold down time until a valid
answer was received after the addHoldDown time expired.

Feel free to twiddle with this.

Work has bogged me down too much to be able to write anything back so
far.  Thanks for the java code; I'll respond with the java*script* code
I've been hacking up at the same time:

https://www.isi.edu/~hardaker/projects/5011/


I didn't add the re-transmit time issue that your code takes into
account, but I did add a query drift that nicely shows one of your
concerns.  In particular, with various values of query drift (including
-1) you can reproduce the real world situation that you're worried
about, which is (as I've mentioned) an important one to call out.

