On Wed, 24 Feb 2010 20:16:24 +0100, mk <mrk...@gmail.com> wrote:
> On 2010-02-24 20:01, Robert Kern wrote:
>> I will repeat my advice to just use random.SystemRandom.choice() instead
>> of trying to interpret the bytes from /dev/urandom directly.
>
> Out of curiosity:
>
> def gen_rand_string(length):
>     prng = random.SystemRandom()
>     chars = []
>     for i in range(length):
>         chars.append(prng.choice('abcdefghijklmnopqrstuvwxyz'))
>     return ''.join(chars)
>
> if __name__ == "__main__":
>     chardict = {}
>     for i in range(10000):
> ##        w = gen_rand_word(10)
>         w = gen_rand_string(10)
>         count_chars(chardict, w)
>     counts = list(chardict.items())
>     counts.sort(key = operator.itemgetter(1), reverse = True)
>     for char, count in counts:
>         print char, count
>
>
> s 3966
> d 3912
> g 3909
> h 3905
> a 3901
> u 3900
> q 3891
> m 3888
> k 3884
> b 3878
> x 3875
> v 3867
> w 3864
> y 3851
> l 3825
> z 3821
> c 3819
> e 3819
> r 3816
> n 3808
> o 3797
> f 3795
> t 3784
> p 3765
> j 3730
> i 3704
>
> Better, although still not perfect.

What would be perfect? Surely one shouldn't be happy if all the tallies
come out exactly equal: that would be a blatant indication of something
very nonrandom going on.

The tallies above give a chi-squared value smack in the middle of the
range expected for random sampling of a uniform distribution (p = 0.505).
So the chi-squared metric of goodness-of-fit to a uniform distribution
says you're doing fine.
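
For anyone who wants to redo the arithmetic, here is a minimal sketch of the chi-squared check, assuming Python 3 and nothing beyond the standard library. The names `tallies` and `chi_squared_uniform` are made up for illustration and do not come from the thread. The expected count per letter is taken from the tallies' own total, and the result is compared with the mean and standard deviation of a chi-squared distribution rather than converted to a p-value.

```python
import math

# Tallies copied from the quoted output above.
tallies = {
    's': 3966, 'd': 3912, 'g': 3909, 'h': 3905, 'a': 3901, 'u': 3900,
    'q': 3891, 'm': 3888, 'k': 3884, 'b': 3878, 'x': 3875, 'v': 3867,
    'w': 3864, 'y': 3851, 'l': 3825, 'z': 3821, 'c': 3819, 'e': 3819,
    'r': 3816, 'n': 3808, 'o': 3797, 'f': 3795, 't': 3784, 'p': 3765,
    'j': 3730, 'i': 3704,
}


def chi_squared_uniform(counts):
    """Pearson chi-squared statistic against equal expected counts."""
    total = sum(counts)
    expected = float(total) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


counts = list(tallies.values())
stat = chi_squared_uniform(counts)
dof = len(counts) - 1  # 26 letters, one constraint (the fixed total)

# A chi-squared distribution has mean dof and standard deviation sqrt(2*dof),
# so a statistic near dof is what uniform random sampling typically produces.
z = (stat - dof) / math.sqrt(2 * dof)

print("chi-squared = %.2f on %d degrees of freedom" % (stat, dof))
print("distance from the mean: %.2f standard deviations" % z)
```

A statistic far above the degrees of freedom would point to uneven tallies; one far below would mean tallies that are suspiciously equal. If SciPy happens to be installed, `scipy.stats.chi2.cdf(stat, dof)` turns the statistic into a cumulative probability, which is one way to check a p figure like the one quoted above.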