On Wed, Feb 02, 2011 at 06:01:36PM -0500, Carl Witthoft wrote:
> Hi, subject more or less says it all.
>
> I freely admit to not having bothered to find some of the online papers
> about method of testing the quality of random number generators -- but
> in an idle moment I wondered what to expect from something like the
> following:
>
>
> randa<-runif(1000)
> randb<-runif(1000)
> t.test(randa,randb)$p.value
> var.test(randa,randb)$p.value
>
> [repeat ad nauseum]
>
>
> Is the range of p-values I get in any way related to the "quality" of the
> random number generator?

Hi.

As already explained, the result of t.test() in this case confirms the
good quality of the Mersenne Twister generator used in R.

The situation is slightly more complicated with ks.test() due to the
32-bit precision of the random numbers, as discussed in the Note section
of ?RNGkind. For example,

  n <- 100000
  ks.test(runif(n), runif(n))

typically produces a warning due to ties. This is not related to the
quality of the randomness. The reason is that the random numbers have
only 32 bits, so by the birthday paradox a sample of just 2^16 numbers
already contains a collision with probability about 0.39. To account for
this, the null hypothesis should be changed to assume the uniform
distribution on the numbers in (0, 1) with at most 32 bits.

See the section Random Number Generators of the CRAN Task View
Probability Distributions by Christophe Dutang for information on CRAN
packages related to random numbers.

As far as I know, the only tests that can distinguish Mersenne Twister
numbers from truly random ones are linear complexity tests mod 2. This
is discussed, for example, in section 7, "Conclusion, Future Work, and
Open Issues", of

  http://www.iro.umontreal.ca/~lecuyer/myftp/papers/horms.pdf

by P. L'Ecuyer. Applications that do not use bitwise mod 2 (XOR)
operations are very unlikely to interfere with the linear tests mod 2.
On the other hand, if bitwise XOR is used, then Mersenne Twister numbers
may be predicted, because the generator is defined using the XOR
operation and the history of the last 624 numbers. A simple
demonstration of this known predictability is contained in

  http://www.cs.cas.cz/~savicky/predict_MT/predict_MT.R

At first glance, this may look very bad. On the other hand, if there is
a relatively simple smooth function of 625 real variables that has a
measurable difference in expected value between Mersenne Twister numbers
and truly random ones, then this is likely to be an interesting
mathematical discovery.

Petr Savicky.
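
P.S. The collision probability of about 0.39 quoted above can be
checked with a few lines of R. This is only a rough sketch: it assumes
that runif() produces exactly 2^32 equally likely values, and
collision_prob is a made-up helper name, not a function from R or any
package.

```r
# Probability that n independent draws from N equally likely values
# contain at least one repeated value (birthday paradox).
# collision_prob is a hypothetical helper, not part of R.
collision_prob <- function(n, N) {
  # P(no collision) = prod_{k=0}^{n-1} (1 - k/N); sum logs for stability
  1 - exp(sum(log1p(-(0:(n - 1)) / N)))
}

N <- 2^32   # assumed number of distinct values runif() can return
n <- 2^16   # sample size from the discussion above

p_exact  <- collision_prob(n, N)
p_approx <- 1 - exp(-n^2 / (2 * N))   # usual approximation: 1 - exp(-1/2)
print(c(exact = p_exact, approx = p_approx))

# Empirical check: fraction of samples of size n that contain a tie.
# Assumes the Mersenne Twister generator; the fraction varies with the seed.
set.seed(1, kind = "Mersenne-Twister")
has_dup <- replicate(200, anyDuplicated(runif(n)) > 0)
print(mean(has_dup))
```

Both computed probabilities should come out close to the 0.39 mentioned
above. The simulated fraction fluctuates from seed to seed, so it should
be read as a plausibility check only, not as a test of the generator.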

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.