Re: Collecting entropy from device_attach() times.

Mariusz Gromada Mon, 24 Sep 2012 15:05:28 -0700

On 2012-09-23 17:17, Pawel Jakub Dawidek wrote:

On Sun, Sep 23, 2012 at 02:37:48AM +0200, Mariusz Gromada wrote:

On 2012-09-22 21:53, Pawel Jakub Dawidek wrote:

Mariusz, can you confirm my findings?


Pawel,

Your conclusions can be easily confirmed by shape analysis of the EDF
(empirical distribution function). Usually the maximum quantile
difference (called the D-statistic) gives you a kind of overview, the
shape of the function gives you a strong feeling, and the p-value gives
you a formal proof.
D-statistic values (your data):

   6bit:   0.33%
   7bit:   0.29%
   8bit:   0.27%
   9bit:   0.21%
  10bit:   6.34%
  11bit:  19.07%
  12bit:  54.80%

What I would say: increasing the number of bits from 6 to 9 does not
affect the "uniformity" of the distribution, but reaching the tenth bit
results in a sudden increase in the difference measure: the more bits,
the larger the observed difference. Shape analysis of the distribution
for the 10th bit shows a non-linear function. There is also no
"randomness" in the quantile difference curve: the chart shows a
complete lack of noise (a purely functional relation). These are very
strong indicators that starting from the 10th bit the distribution
changes and is no longer uniform.
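The D-statistic described above can be sketched in R as the largest
vertical gap between two empirical distribution functions. This is only
an illustration: d_statistic is a hypothetical helper name, and the
sample is synthetic stand-in data, not the measurements from this thread.

```r
# Hypothetical helper: maximum distance between two empirical
# distribution functions, evaluated at every observed value.
d_statistic <- function(x, y) {
  grid <- sort(unique(c(x, y)))
  max(abs(ecdf(x)(grid) - ecdf(y)(grid)))
}

set.seed(1)                          # synthetic stand-in data
n_bits <- 6
theoretical <- 0:(2^n_bits - 1)      # each value exactly once
empirical <- sample(theoretical, 10000, replace = TRUE)

d_statistic(empirical, theoretical)
```

For two numeric vectors this should agree with the statistic reported by
ks.test(), which is the function used for the formal results.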

To formally confirm the above conclusion at, e.g., a 5% significance
level (that is, a 95% confidence level), I need some extra data
regarding sample sizes. Please send me the number of observations
collected in each of the 6-12 bit experiments.


Total number of observations was 162833.

Ok, finally I have some formal results. To be completely honest I need
to point out that, in fact, we have discrete data (for example the
integers 0, 1, ..., 63, not continuous numbers spread between 0 and
63). That is why I am going to use the two-sample Kolmogorov-Smirnov
test. The methodology is simple:


- Pawel's data will be called the empirical data.

- Theoretical data will be generated as a sequence of unique integers
from 0 to 2**n - 1, where n is the number of bits. Assumption: each
number appears in the theoretical data only once, representing an
ideal uniform distribution.
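As a rough illustration of this methodology, the sketch below runs the
same kind of comparison on synthetic data. Both samples are made-up
stand-ins (one drawn uniformly, one deliberately skewed), and the
variable names are only illustrative.

```r
set.seed(42)                         # synthetic stand-in data
n_bits <- 8
theoretical <- 0:(2^n_bits - 1)      # each value exactly once: ideal uniform

# A sample drawn uniformly from the same range.
uniform_sample <- sample(theoretical, 20000, replace = TRUE)

# A deliberately skewed sample: the minimum of two uniform draws,
# which favours small values.
skewed_sample <- pmin(sample(theoretical, 20000, replace = TRUE),
                      sample(theoretical, 20000, replace = TRUE))

# Discrete data always contain ties, so R warns that the p-values
# are approximate; the warning is silenced here.
res_uniform <- suppressWarnings(ks.test(uniform_sample, theoretical))
res_skewed  <- suppressWarnings(ks.test(skewed_sample, theoretical))

c(D = unname(res_uniform$statistic), p = res_uniform$p.value)
c(D = unname(res_skewed$statistic),  p = res_skewed$p.value)
```

The skewed sample should give a much larger D and a tiny p-value, which
is the pattern to look for in the real data.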


Calculations will be done in R (from CRAN).

Loading empirical data from files:

> e6 = read.table("E:\\pawel\\dhr2_6bit_sorted.txt")
> e7 = read.table("E:\\pawel\\dhr2_7bit_sorted.txt")
> e8 = read.table("E:\\pawel\\dhr2_8bit_sorted.txt")
> e9 = read.table("E:\\pawel\\dhr2_9bit_sorted.txt")
> e10 = read.table("E:\\pawel\\dhr2_10bit_sorted.txt")
> e11 = read.table("E:\\pawel\\dhr2_11bit_sorted.txt")
> e12 = read.table("E:\\pawel\\dhr2_12bit_sorted.txt")

Generating ideal theoretical data:

> t6 = c(0:(2**6-1))
> t7 = c(0:(2**7-1))
> t8 = c(0:(2**8-1))
> t9 = c(0:(2**9-1))
> t10 = c(0:(2**10-1))
> t11 = c(0:(2**11-1))
> t12 = c(0:(2**12-1))

Performing KS tests:

> ks.test(e6, t6)
D = 0.0032, p-value = 1

> ks.test(e7, t7)
D = 0.0029, p-value = 1

> ks.test(e8, t8)
D = 0.0027, p-value = 1

> ks.test(e9, t9)
D = 0.0022, p-value = 1

> ks.test(e10, t10)
D = 0.0634, p-value = 0.0005562

> ks.test(e11, t11)
D = 0.1907, p-value < 2.2e-16

> ks.test(e12, t12)
D = 0.5479, p-value < 2.2e-16

As you can see, the D-statistics are almost the same as those calculated
by Pawel (allowing for rounding). The p-values are very interesting
because of the very high number of observations generated by Pawel.
Between 6 and 9 bits the estimated p-values are equal to 1, which means
that it is impossible (at any significance level) to reject the null
hypothesis stating that the compared distributions are equal. Final
conclusion: it has to be random, and for sure it is random!

Additionally, starting from 10 bits we can observe a dramatic decrease
of the p-value (from 100% to ca. 0.06%, and much less for 11-12 bits).
Such a low p-value means that we have to reject the null hypothesis
stating that the compared distributions are equal. Final conclusion: it
cannot be random, and for sure it is not random.
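The decision rule at a 5% significance level can be written down
directly from the p-values reported above. This is just a sketch: for 11
and 12 bits R only prints "< 2.2e-16", so that bound is used as a
stand-in for the unknown exact value.

```r
alpha <- 0.05                        # 5% significance level

# p-values as reported by ks.test above; 2.2e-16 is an upper bound
# standing in for the 11-bit and 12-bit results.
p_values <- c("6" = 1, "7" = 1, "8" = 1, "9" = 1,
              "10" = 0.0005562, "11" = 2.2e-16, "12" = 2.2e-16)

reject <- p_values < alpha           # TRUE means: reject uniformity

data.frame(bits = names(p_values),
           p_value = unname(p_values),
           reject_uniformity = unname(reject))
```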

I did the same comparison for the previous real device attach data (2081
obs.). The R code and the results are below:


> e16 = read.table("E:\\pawel\\device_attach_16bit.log")
> t16 = c(0:(2**16-1))
> ks.test(e16, t16)
D = 0.0178, p-value = 0.5422

Again, the D-statistic and p-value are almost the same as previously
calculated "manually". The p-value is very high (not as high as in the
6-12 bit tests, but consider the much lower number of observations: 2081
vs 162833), giving near certainty that you have captured real 16-bit
entropy!
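To see how the sample sizes enter the picture, here is a sketch using
the standard large-sample approximation for the two-sample KS critical
value, D_crit = sqrt(-ln(alpha/2)/2) * sqrt((n+m)/(n*m)). ks_critical
is a hypothetical helper name, and with discrete (tied) data the result
is only approximate. The D values and sample sizes are the ones quoted
in this thread.

```r
# Hypothetical helper: approximate critical D for a two-sample KS test
# with sample sizes n and m at significance level alpha.
ks_critical <- function(n, m, alpha = 0.05) {
  sqrt(-log(alpha / 2) / 2) * sqrt((n + m) / (n * m))
}

bits       <- c(6, 7, 8, 9, 10, 11, 12, 16)
n_obs      <- c(rep(162833, 7), 2081)    # empirical sample sizes
d_reported <- c(0.0032, 0.0029, 0.0027, 0.0022,
                0.0634, 0.1907, 0.5479, 0.0178)

# The theoretical sample has 2^bits values, each appearing once.
d_crit <- ks_critical(n_obs, 2^bits)

data.frame(bits, n_obs, d_reported,
           d_crit = round(d_crit, 4),
           reject = d_reported > d_crit)
```

Under this approximation only the 10, 11 and 12 bit results exceed
their critical values, which agrees with the p-values above.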


Regards,
Mariusz

_______________________________________________
freebsd-security@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-security
To unsubscribe, send any mail to "freebsd-security-unsubscr...@freebsd.org"
