On 07.06.2015 08:27, Cecil Westerhof wrote:
> I wrote a very simple function to test random:
>     import random
>
>     def test_random(length, multiplier = 10000):
>         number_list = length * [0]
>         for i in range(length * multiplier):
>             number_list[random.randint(0, length - 1)] += 1
>         minimum = min(number_list)
>         maximum = max(number_list)
>         return (minimum, maximum, minimum / maximum)
>
> When running:
>     for i in range(1, 7):
>         print(test_random(100, 10 ** i))
> I get:
>     (3, 17, 0.17647058823529413)
>     (73, 127, 0.5748031496062992)
>     (915, 1086, 0.8425414364640884)
>     (9723, 10195, 0.9537027954879843)
>     (99348, 100620, 0.9873583780560524)
>     (997198, 1002496, 0.9947151908835546)
>
> and when running:
>     for i in range(1, 7):
>         print(test_random(1000, 10 ** i))
> I get:
>     (2, 20, 0.1)
>     (69, 138, 0.5)
>     (908, 1098, 0.8269581056466302)
>     (9684, 10268, 0.9431242695753799)
>     (99046, 100979, 0.9808574059953059)
>     (996923, 1003598, 0.9933489305478886)
>
> It shows that when the first parameter increases the deviation
> increases, and when the second parameter increases the deviation
> decreases. Exactly what you would expect. But what are the ranges
> you would expect with a good random function?
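As a rough, back-of-the-envelope answer to that question (an added
sketch, not part of the original thread): each bucket count follows a
Binomial(length * multiplier, 1 / length) distribution, so its mean is
multiplier and its standard deviation is close to sqrt(multiplier).
The extremes among length such counts typically land around
sqrt(2 * ln(length)) standard deviations from the mean, which gives a
crude estimate of the min/max ratio to expect:

    import math

    def expected_ratio(length, multiplier):
        # Mean and standard deviation of a single bucket count.
        mean = multiplier
        sd = math.sqrt(multiplier * (1 - 1 / length))
        # Typical distance of the min/max among `length` buckets,
        # from the usual extreme-value approximation.
        k = math.sqrt(2 * math.log(length))
        return (mean - k * sd) / (mean + k * sd)

    for i in range(1, 7):
        print(expected_ratio(100, 10 ** i))

For length = 100 this predicts ratios around 0.54 and 0.94 for
multipliers of 100 and 10000, close to the 0.57 and 0.95 observed
above. The estimate is unreliable for very small multipliers, where
the normal approximation to the binomial breaks down.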
Really depends on the number of samples. Apparently, a good RNG would
come close to 1.0 for the ratio.

> Then it could be used to test a random function.

This is an interesting test (more interesting to me than it looks at
first sight, and certainly better than what I had come up with), but
unfortunately there is more to testing randomness. The test clearly
suggests that random functions should have a min/max ratio of about
1.0. Think of a "random" function like this:

    def fake_random(minv, maxv, _cache={}):
        try:
            return next(_cache[minv, maxv])
        except (StopIteration, KeyError):
            iterator = iter(range(minv, maxv + 1))
            _cache[minv, maxv] = iterator
            return next(iterator)

(If that is hard to read, I agree. It returns the sequence from minv
to maxv (inclusive) over and over again for equal minv and maxv.
Don’t pass anything to _cache :), just call it like random.randint.)

This gives a "perfect" ratio of 1.0, but is not very random. This kind
of function would probably be ruled out by a compression or
correlation test; a rough sketch of such a check follows at the end of
this message.

If you want to dig deeper into the algorithms for randomness testing,
the dieharder test suite [1] is probably worth a look.

It all boils down to the use case. For some use cases ``fake_random``
might be good enough (unit testing would be such a case: it is
certainly uniformly distributed and allows full coverage of the tested
range); for others it would fail catastrophically (don’t generate your
cryptographic keys with this!).

cheers,
Jonas

[1]: http://www.phy.duke.edu/~rgb/General/dieharder.php
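To make the compression idea above concrete, here is a minimal sketch
(an editorial addition, not from the original post): it uses zlib as a
crude stand-in for a real compression test, and the comparison is
qualitative rather than a calibrated pass/fail threshold.

    import random
    import zlib

    def compression_ratio(byte_source, n=10000):
        # Draw n bytes from the source and see how well they compress.
        # Genuinely random bytes are nearly incompressible (ratio near
        # 1.0); a repeating sequence compresses to far below that.
        data = bytes(byte_source() for _ in range(n))
        return len(zlib.compress(data)) / len(data)

    print(compression_ratio(lambda: random.randint(0, 255)))  # near 1.0
    # With the fake_random definition from above in scope:
    print(compression_ratio(lambda: fake_random(0, 255)))     # far below 1.0

Despite fake_random’s perfect min/max ratio, this check separates the
two sources immediately, which is exactly the point: no single
statistic is enough on its own.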