Edward, this is OT but may I suggest you to use something like Wolfram Alpha to 
perform your calculations a bit more comfortably?

-- 
Enrico M. Crisostomo

On Jan 12, 2011, at 4:24, Edward Ned Harvey 
<opensolarisisdeadlongliveopensola...@nedharvey.com> wrote:

> For anyone who still cares:
> 
> I'm calculating the odds of a sha256 collision in an extremely large zpool,
> containing 2^35 blocks of data, and no repetitions.
> 
> The formula on wikipedia for the birthday problem is:
> p(n;d) ~= 1-( (d-1)/d )^( 0.5*n*(n-1) )
> 
> In this case, 
> n=2^35
> d=2^256
> 
> The problem is, this formula "does not compute" because n is so large.
> Fortunately x = e^ln(x) and so we're able to use this technique to make the
> huge exponent instead, a huge multiplication.
> 
> (Using the bc mathlib notation, the l(x) function is the natural log of x,
> and e(x) is e raised to the power of x)
> p(n;d) ~= 1-e(   (  0.5*n*(n-1)*l((d-1)/d)  )   )
> 
> Using bc to calculate the answer:
> bc -l
> 
> n=2^35
> d=2^256
> scale=1024
> 1-e(   (  0.5*n*(n-1)*l((d-1)/d)  )   )
> .0000000000000000000000000000000000000000000000000000000050978941154
> I manually truncated here (precision goes out 1024 places).  This is
> 5.1*10^-57
> 
> Note: I had to repeat the calculation many times, setting a larger and
> larger scale.  The default scale of 20, and even 64 and 70 and 80 were not
> precise enough to produce a convergent answer around the -57th decimal
> place.  So I just kept going larger, and in retrospect, anything over 100
> would have been fine.  I wrote 1024 above, so who cares.
> 
> If you've been paying close attention you'll recognize this is the same
> answer I originally calculated, but my original equation was in fact wrong.
> It just so happens that my original equation neglects the probability of
> collisions from previous events, so it is actually accurate whenever the
> probability of previous events is insignificant.  It is merely luck that for
> the data size in question, my equation produced something that looked
> correct.  It would produce a wrong calculation of probability for a larger
> value of n.
> 
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Reply via email to