On Thu, 11 Apr 2013 11:45:31 +1000, Chris Angelico wrote:

> On Thu, Apr 11, 2013 at 11:21 AM, gry <georgeryo...@gmail.com> wrote:
>> avail_chrs =
>> '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&
>> \'()*+,-./:;<=>?@[\\]^_`{}'
>
> Is this exact set of characters a requirement? For instance, would it be
> acceptable to instead use this set of characters?
>
> avail_chrs =
> 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
>
> Your alphabet has 92 characters, this one only 64... the advantage is
> that it's really easy to work with a 64-character set; in fact, for this
> specific set, it's the standard called Base 64, and Python already has a
> module for working with it. All you need is a random stream of eight-bit
> characters, which can be provided by os.urandom().
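For reference, here is a minimal sketch of that suggestion, assuming Python 3. The helper name random_b64_string is made up for illustration and is not part of any module; only os.urandom() and base64.b64encode() come from the standard library.

```python
from base64 import b64encode
from os import urandom

def random_b64_string(length):
    """Return `length` random characters from the Base 64 alphabet.

    Hypothetical helper, for illustration only.
    """
    # Every 3 random bytes encode to exactly 4 Base 64 characters, so
    # round up to a whole number of 3-byte groups to avoid '=' padding.
    nbytes = 3 * ((length + 3) // 4)
    return b64encode(urandom(nbytes)).decode('ascii')[:length]

print(random_b64_string(16))
```

Asking for a multiple of 3 bytes and then slicing means no '=' ever appears in the result, so nothing has to be stripped afterwards.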
I was originally going to write that using the base64 module would
introduce bias into the random strings, but after a little investigation,
I don't think it does. Or at least, if it does, it's a fairly subtle bias,
and not detectable by the simple technique I used: inspect the mean, and
the mean deviation from the mean.

from os import urandom
from base64 import b64encode

# Python 3: iterating over bytes yields ints in range(256).
data = urandom(1000000)
m = sum(data)/len(data)
md = sum(abs(v - m) for v in data)/len(data)
print("Mean and mean deviation of urandom:", m, md)

# Drop the '=' padding, then map each character back to its 6-bit value.
encoded = b64encode(data).strip(b'=')
chars = (b'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef'
         b'ghijklmnopqrstuvwxyz0123456789+/')
values = [chars.index(v) for v in encoded]
m = sum(values)/len(values)
md = sum(abs(v - m) for v in values)/len(values)
print("Mean and mean deviation of encoded data:", m, md)

When I run this, it prints:

Mean and mean deviation of urandom: 127.451652 63.95331188965717
Mean and mean deviation of encoded data: 31.477027511486245 15.991177272527072

For a uniform distribution I would expect 127.5 and 64 for the raw bytes,
and 31.5 and 16 for the encoded values, so we're pretty close. That's not
to say that there aren't any other biases or correlations in the encoded
data, but after a simplistic test, it looks okay to me.

-- 
Steven
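A slightly more direct check than the mean is to count how often each of the 64 symbols turns up. The following is only a sketch, assuming Python 3; the 3,000,000-byte sample size is an arbitrary choice, picked as a multiple of 3 so that no '=' padding is produced. (With 1,000,000 bytes the final group holds a single byte, so the last encoded character carries only two random bits; over a million bytes that is far too small an effect to show up in the mean.)

```python
from base64 import b64encode
from collections import Counter
from os import urandom

# 3,000,000 bytes is a multiple of 3, so there is no '=' padding and
# every output character carries a full 6 bits of random data.
encoded = b64encode(urandom(3000000))
counts = Counter(encoded)        # maps byte value -> number of occurrences
expected = len(encoded) / 64     # count per symbol if perfectly uniform

# Largest relative deviation from the expected count over all symbols.
worst = max(abs(n - expected) / expected for n in counts.values())
print("distinct symbols:", len(counts))
print("largest relative deviation from uniform:", worst)
```

Like the mean test, this looks only at single-character frequencies; it says nothing about correlations between neighbouring characters.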