On 11/20/2017 11:26 AM, Wes Hardaker wrote:
Michael StJohns <m...@nthpermutation.com> writes:
1 something.
I think that the consensus is clearly something like that. Are you
(MSJ) interested in supplying a suggested final equation for it?
Ok - after thinking about it, it turns out to be fairly simple.
1) Initially, ignore the outliers - the servers that are down and will
be down throughout the entire safety period. It's probable that most of
them were down during the original uptake period.
2) Assume a success rate of p per retry. I'm going to use .01 - that
is, in each retry period only 1 of 100 entities completes the last query.
3) Calculate Log.x(M), where M is the number of clients (arbitrarily
chosen at 10M) and x is 1/(1-p), the reciprocal of the failure rate
(1-p is the proportion of servers still waiting to complete after each
retry interval). Log.x(M) gives the number of intervals needed to
reduce the expected set of uncompleted servers to about 1, assuming
each retry succeeds independently with probability p.
With p = .01 and M = 10M, that gives you about 1604 fast retry
intervals (1603.7). Setting p and M to different values gets you a
range of answers:
Number of retry intervals, Log.x(M), by number of resolvers (M, columns)
and probability of success per retry interval (p, rows):

    p      10,000   100,000   1,000,000   10,000,000   100,000,000
   0.01    916.42   1145.53     1374.63      1603.74       1832.84
   0.05    179.56    224.45      269.34       314.23        359.12
   0.1      87.42    109.27      131.13       152.98        174.83
   0.15     56.67     70.84       85.01        99.18        113.34
   0.25     32.02     40.02       48.02        56.03         64.03
   0.5      13.29     16.61       19.93        23.25         26.58
   0.9       4         5           6            7             8
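For anyone who wants to check the table or try other values, here is a
minimal Python sketch of the Log.x(M) calculation. The function name
retry_intervals and its parameter names are mine, purely for
illustration; the formula is just the change of base
Log.x(M) = ln(M) / ln(1/(1-p)).

```python
import math


def retry_intervals(p: float, m: int) -> float:
    """Return Log.x(M) with x = 1/(1-p).

    p: assumed probability that a resolver completes in one retry interval
    m: assumed number of resolvers (clients)
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if m < 1:
        raise ValueError("m must be at least 1")
    # ln(1/(1-p)) == -ln(1-p); log1p keeps precision when p is small.
    return math.log(m) / -math.log1p(-p)


if __name__ == "__main__":
    # Print one row per success probability, one column per population size.
    populations = [10_000, 100_000, 1_000_000, 10_000_000, 100_000_000]
    for p in (0.01, 0.05, 0.1, 0.15, 0.25, 0.5, 0.9):
        row = "  ".join(f"{retry_intervals(p, m):10.2f}" for m in populations)
        print(f"{p:5.2f}  {row}")
```

Fractional results would normally be rounded up to a whole number of
intervals before use.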
(Think of it this way. Pretend you have 1000 resolvers and each has a
10% chance of completing in each interval. After the first interval,
900 are left; after the second, 810; after the third, 729; and so on.
Ignoring rounding, you need about 66 retries to get down to < 1 left,
which is Log1.11111(1000), roughly 65.6, rounded up.)
This doesn't account for the servers that are offline, but see (1) above
for why it's probably safe to ignore them.
So a publisher can pick an M and x (or p) that is their best guess from
the data they have and calculate:
safetyInterval ::= Log.x(M) * fastRetryInterval
Or simply make some worst case assumptions (.01 success rate, 10M
clients) and use a number from the table.
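As a rough sketch of how a publisher might turn that into a duration:
the function name safety_interval and the one-hour fastRetryInterval
below are placeholders of mine, not values from the draft, so
substitute whatever fast retry interval actually applies.

```python
import math
from datetime import timedelta


def safety_interval(p: float, m: int, fast_retry_interval: timedelta) -> timedelta:
    """safetyInterval = Log.x(M) * fastRetryInterval, with x = 1/(1-p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    intervals = math.log(m) / -math.log1p(-p)  # Log.x(M)
    return fast_retry_interval * intervals


if __name__ == "__main__":
    # Worst-case assumptions from the message: .01 success rate, 10M clients.
    # The one-hour retry interval is a hypothetical value for illustration.
    result = safety_interval(0.01, 10_000_000, timedelta(hours=1))
    print(result)
```

With these inputs the result is about 1603.7 retry intervals' worth of
time, i.e. roughly 66.8 days for a one-hour interval.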
Mike
_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop