Chris Angelico wrote:

> On Sun, Jun 7, 2015 at 8:40 PM, Thomas 'PointedEars' Lahn
> <pointede...@web.de> wrote:
>> Cecil Westerhof wrote:
>>> I wrote a very simple function to test random:
>>>     def test_random(length, multiplier = 10000):
>>>         number_list = length * [0]
>>>         for i in range(length * multiplier):
>>>             number_list[random.randint(0, length - 1)] += 1
>>>         minimum = min(number_list)
>>>         maximum = max(number_list)
>>>         return (minimum, maximum, minimum / maximum)
>>
>> As there is no guarantee that every number will occur randomly, using a
>> dictionary at first should be more efficient than a list:
>
> Hmm, I'm not sure that's actually so. His code is aiming to get
> 'multiplier' values in each box; for any serious multiplier (he starts
> with 10 in the main code), you can be fairly confident that every
> number will come up at least once.

The wording shows a common misconception: that random distribution would
mean that it is guaranteed or more probable that every element of the set
will occur at least once.  It is another common misconception that
increasing the number of trials would increase the probability of that
happening.  But that is not so.

The law of large numbers only says that, as you increase the number of
trials, the relative frequency *approaches* the probability for each value
of the random variable, or IOW “the average of the results obtained from a
large number of trials should be close to the expected value, and will
*tend to become closer* as more trials are performed.”
(<http://en.wikipedia.org/wiki/Law_of_large_numbers>; emphasis by me)

> […] a true RNG could legitimately produce nothing but 7s for the entire
> run, it's just extremely unlikely.

That reasoning is precisely a result of the misconception described above.
Because people think that every value must occur, they do not think it
possible that (much) repetition could occur with a (pseudo-)random
generator, and when they want to mince words they say “(extremely)
unlikely” instead.

For example, when people see

  6 5 8 7 9 3 7 8 4 7 5 6 8 8 1 2 8 3 5 7 5 4 1 2 4 8 8 7 5 1

and are told that this is a random sequence, they find it hard to believe.
They think something like: “Look at all those repeated 8s, and ‘7, 5’
occurs twice.  4 occurs more often than 2, and there are many more 5s than
1s.  That cannot possibly be random!”

Yes, it *can*.  I have just produced it with

#------------------------------------------------------------------
import random
print(" ".join(str(random.randint(1, 9)) for _ in range(30)))
#------------------------------------------------------------------

But if you think this *Mersenne Twister* PRNG, which “generates numbers
with nearly uniform distribution and a large period”, is flawed, use a
proper die as RNG and a sheet of paper to record the outcomes, and do the
experiment.
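
For readers without a die at hand, here is a rough sketch of that
experiment.  It assumes that random.SystemRandom (which draws from the
operating system's entropy source rather than from the Mersenne Twister)
is an acceptable stand-in for a physical die.  The helper names
relative_frequencies and max_deviation are made up for this
illustration.  It prints, for a growing number of rolls, how far the
observed relative frequencies stray from 1/6.

```python
import random
from collections import Counter


def relative_frequencies(rng, trials, sides=6):
    """Roll a fair die `trials` times; map each face to its relative frequency."""
    counts = Counter(rng.randint(1, sides) for _ in range(trials))
    # Faces that never came up are reported with frequency 0.
    return {face: counts[face] / trials for face in range(1, sides + 1)}


def max_deviation(freqs):
    """Largest distance between an observed frequency and 1/len(freqs)."""
    expected = 1 / len(freqs)
    return max(abs(f - expected) for f in freqs.values())


# Assumption: SystemRandom (backed by os.urandom) stands in for a real die.
rng = random.SystemRandom()

for trials in (60, 600, 6000, 60000):
    freqs = relative_frequencies(rng, trials)
    print(trials, round(max_deviation(freqs), 4))
```

The printed deviation will typically get smaller as the number of rolls
grows, but nothing forces it to shrink from one run to the next; the law
of large numbers describes a tendency only.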
The outcome of that experiment is not going to be different.  If the
distribution is even and the game is fair, every outcome *always* has the
same probability of occurring.  As a result, every sequence of outcomes of
equal length *always* has the same probability of occurring, and the
probability for a particular sequence of equal length does _not_ increase
or decrease based on the number of occurrences of previous outcomes.
Those are *independent* events.

See also:
<http://www.teacherlink.org/content/math/interactive/probability/numbersense/misconceptions/home.html>,
in particular the section “Representativeness”

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me.