Re: Testing random

Thomas 'PointedEars' Lahn Sun, 07 Jun 2015 08:56:54 -0700

Chris Angelico wrote:

> On Sun, Jun 7, 2015 at 8:40 PM, Thomas 'PointedEars' Lahn
> <pointede...@web.de> wrote:
>> Cecil Westerhof wrote:
>>> I wrote a very simple function to test random:
>>>     def test_random(length, multiplier = 10000):
>>>         number_list = length * [0]
>>>         for i in range(length * multiplier):
>>>             number_list[random.randint(0, length - 1)] += 1
>>>         minimum = min(number_list)
>>>         maximum = max(number_list)
>>>         return (minimum, maximum, minimum / maximum)
>>
>> As there is no guarantee that every number will occur randomly, using a
>> dictionary at first should be more efficient than a list:
> 
> Hmm, I'm not sure that's actually so. His code is aiming to get
> 'multiplier' values in each box; for any serious multiplier (he starts
> with 10 in the main code), you can be fairly confident that every
> number will come up at least once.

The wording shows a common misconception: that random distribution would
mean that it is guaranteed or more probable that every element of the set
will occur at least once. It is another common misconception that
increasing the number of trials would increase the probability of that
happening. But that is not so.

The law of large numbers only says that as you increase the number of
trials, the relative frequency *approaches* the probability for each
value of the random variable, or IOW “the average of the results
obtained from a large number of trials should be close to the expected
value, and will *tend to become closer* as more trials are performed.”
(<http://en.wikipedia.org/wiki/Law_of_large_numbers>; emphasis by me)
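
A small simulation can make that statement concrete. This is only a sketch: the function name max_frequency_deviation, the fixed seed and the trial counts are illustrative choices, not something taken from the thread. It draws integers from 1 to 9 and reports how far the observed relative frequencies are from the probability 1/9.

```python
import random
from collections import Counter

def max_frequency_deviation(trials, low=1, high=9, seed=12345):
    """Return the largest |relative frequency - expected probability|
    over all values low..high after `trials` draws."""
    rng = random.Random(seed)  # private, seeded generator so runs are repeatable
    counts = Counter(rng.randint(low, high) for _ in range(trials))
    expected = 1 / (high - low + 1)
    # Counter returns 0 for values that never came up.
    return max(abs(counts[value] / trials - expected)
               for value in range(low, high + 1))

if __name__ == "__main__":
    for trials in (100, 10_000, 100_000):
        print(trials, max_frequency_deviation(trials))
```

The deviation typically shrinks as the number of trials grows, although it need not shrink at every single step.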

> […] a true RNG could legitimately produce nothing but 7s for the entire
> run, it's just extremely unlikely.

That reasoning is precisely a result of the misconception described above.
Because people think that every value must occur, they do not think it
possible that (much) repetition could occur with a (pseudo-)random
generator, and when they want to mince words they say “(extremely) unlikely”
instead. For example, when people see

6 5 8 7 9 3 7 8 4 7 5 6 8 8 1 2 8 3 5 7 5 4 1 2 4 8 8 7 5 1

and are told that this is a random sequence, they find it hard to believe.
They think something like: “Look at all those repeated 8s, and ‘7, 5’
occurs three times. 4 occurs more often than 2, and there are many more 5s
than 1s. That cannot possibly be random!”

Yes, it *can*. I have just produced it with

#------------------------------------------------------------------
import random
print(" ".join([str(random.randint(1, 9)) for i in range(0, 30)]))
#------------------------------------------------------------------

But if you think this *Mersenne Twister* PRNG which “generates numbers with
nearly uniform distribution and a large period” is flawed, use a proper die
as RNG, and a sheet of paper to record the outcomes, and do the experiment.
The outcome is not going to be different.
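
The kind of tally a skeptic would make of the 30 outcomes quoted above can be reproduced mechanically. A minimal sketch, assuming nothing beyond the standard library; the variable names are mine:

```python
from collections import Counter

# The 30 outcomes quoted in this posting, copied verbatim.
sequence = [int(token) for token in
            "6 5 8 7 9 3 7 8 4 7 5 6 8 8 1 2 8 3 5 7 5 4 1 2 4 8 8 7 5 1".split()]

# How often each value 1..9 occurs.
counts = Counter(sequence)

# How often each pair of adjacent outcomes occurs.
pairs = Counter(zip(sequence, sequence[1:]))

for value in range(1, 10):
    print(value, counts[value])
print("pair (7, 5):", pairs[(7, 5)])
```

Uneven counts like these are what a single short run of a uniform generator normally looks like.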

If the distribution is even and the game is fair, every outcome *always* has
the same probability of occurring. As a result, every sequence of outcomes
of equal length *always* has the same probability of occurring, and the
probability for a particular sequence of equal length does _not_ increase or
decrease based on the number of occurrences of previous outcomes. Those are
*independent* events.
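
To put a number on that: under the stated assumptions (nine equally likely outcomes, independent draws) the probability of one particular sequence depends only on its length. A hedged sketch with exact fractions; sequence_probability is a hypothetical helper name, not anything from the random module:

```python
from fractions import Fraction

def sequence_probability(sequence, outcomes=9):
    """Exact probability of one specific sequence of independent,
    uniformly distributed draws with `outcomes` possible values."""
    probability = Fraction(1)
    for _ in sequence:
        # Independence: every draw contributes the same factor 1/outcomes,
        # whatever the earlier draws were.
        probability *= Fraction(1, outcomes)
    return probability

observed = [int(token) for token in
            "6 5 8 7 9 3 7 8 4 7 5 6 8 8 1 2 8 3 5 7 5 4 1 2 4 8 8 7 5 1".split()]
all_sevens = [7] * len(observed)

# Both specific sequences have probability (1/9)**30.
print(sequence_probability(observed) == sequence_probability(all_sevens))  # True
```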

See also:
<http://www.teacherlink.org/content/math/interactive/probability/numbersense/misconceptions/home.html>,
in particular the section “Representativeness”.

--
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.
--
https://mail.python.org/mailman/listinfo/python-list
