On 11/20/2017 11:26 AM, Wes Hardaker wrote:
> Michael StJohns <m...@nthpermutation.com> writes:
>
> > 1 something.
>
> I think that the consensus is clearly something like that.  Are you
> (MSJ) interested in supplying a suggested final equation for it?


Ok - after thinking about it, it turns out to be fairly simple.


1) Initially, ignore the outliers - the servers that are down and will be down throughout the entire safety period.  It's probable that most of them were down during the original uptake period.

2) Assume a success rate of p per retry.  I'm going to use .01 - that is, in each retry period only 1 in 100 of the remaining resolvers completes the final query.

3) Calculate log_x(M), where M is the number of clients - arbitrarily chosen here as 10M - and where x = 1/(1-p).  Here (1-p) is the per-interval failure rate, i.e. the proportion of resolvers still waiting to complete after the previous retry interval.  log_x(M) gives the number of intervals needed to reduce the set of uncompleted resolvers to below 1, assuming each resolver completes independently with probability p per interval.

That gives you about 1603 fast retry intervals.  Setting p and M to different values gives a range of answers:


        
Number of fast retry intervals, log_x(M), by number of resolvers (M)
and probability of success per retry interval (p):

  p          10,000     100,000    1,000,000   10,000,000  100,000,000
  0.01     916.4212    1145.526     1374.632     1603.737   1832.84231
  0.05     179.5623    224.4528     269.3434    314.23397    359.12454
  0.1       87.41738   109.2717     131.1261    152.98042   174.834763
  0.15      56.67242    70.84052     85.00862   99.176728   113.344832
  0.25      32.01569    40.01961     48.02354   56.027459    64.0313822
  0.5       13.28771    16.60964     19.93157   23.253497    26.5754248
  0.9        4           5            6           7            8
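For anyone who wants to reproduce or extend the table, here's a minimal
Python sketch of the calculation (the function and variable names are
mine, purely illustrative):

    import math

    def retry_intervals(M, p):
        # Number of fast retry intervals needed to drive M resolvers,
        # each completing with probability p per interval, below 1
        # remaining: log_x(M) with x = 1/(1-p).
        return math.log(M) / math.log(1.0 / (1.0 - p))

    for p in (0.01, 0.05, 0.1, 0.15, 0.25, 0.5, 0.9):
        row = [retry_intervals(M, p) for M in (10**4, 10**5, 10**6, 10**7, 10**8)]
        print(p, ["%.2f" % n for n in row])

Running that reproduces the rows above (e.g. 1603.74 for p = .01 and
M = 10M).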


(Think of it this way.  Pretend you have 1000 resolvers and each has a 10% chance of completing in each interval.  After the first interval, 900 are left; after the second, 810; after the third, 729; and so on.  Ignoring rounding, you need about 66 retries to get down to < 1 left, which is log_1.1111(1000) = 65.6.)
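The same intuition as a tiny loop, using the numbers from the example
above:

    # 1000 resolvers, each with a 10% chance of completing per interval;
    # count intervals until fewer than 1 is still waiting.
    remaining = 1000.0
    intervals = 0
    while remaining >= 1.0:
        remaining *= 0.9  # 90% are still waiting after each interval
        intervals += 1
    print(intervals)  # 66, i.e. log_1.1111(1000) = 65.6 rounded up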

This doesn't account for the servers that are offline, but see (1) above for why it's probably safe to ignore them.

So a publisher can pick an M and x (or p) that is their best guess from the data they have and calculate:

safetyInterval ::=  log_x(M) * fastRetryInterval

Or simply make some worst-case assumptions (a .01 success rate per retry, 10M clients) and use a number from the table.
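As a sketch of what that works out to, here's the formula in Python with
an assumed (purely illustrative) fast retry interval of one hour:

    import math

    def safety_interval(M, p, fast_retry_interval):
        # safetyInterval = log_x(M) * fastRetryInterval, x = 1/(1-p);
        # M and p are the publisher's best-guess parameters.
        return math.log(M) / math.log(1.0 / (1.0 - p)) * fast_retry_interval

    # Worst-case assumptions from the table: p = .01, M = 10 million,
    # with a hypothetical 1-hour fast retry interval.
    print(safety_interval(10**7, 0.01, 1.0))  # ~1603.7 hours, about 67 days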


Mike


