Steven D'Aprano wrote:

> How about some really random data?
>
> py> import string
> py> data = ''.join(random.choice(string.ascii_letters) for i in range(21000))
> py> len(codecs.encode(data, 'bz2'))
> 15220
>
> That's actually better than I expected: it's found some redundancy and
> saved about a quarter of the space.

It didn't find any redundancy; it found the two or so unused bits in each byte:

>>> math.log(len(string.ascii_letters), 2)
5.700439718141093
>>> 21000./8*_
14963.654260120367
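
For anyone who wants to repeat the comparison on Python 3, here is a rough sketch that puts the bz2 result next to the entropy bound computed above. The helper name entropy_bound_bytes and the fixed seed are my own illustrative assumptions, and bz2.compress on ASCII-encoded bytes is swapped in for codecs.encode(data, 'bz2'), which wants bytes rather than str on Python 3.

```python
import bz2
import math
import random
import string

N = 21000  # number of characters, as in the quoted example


def entropy_bound_bytes(n_chars, alphabet_size):
    """Smallest average size, in bytes, for n_chars symbols drawn
    uniformly and independently from an alphabet of the given size."""
    bits_per_char = math.log2(alphabet_size)
    return n_chars * bits_per_char / 8


# Fixed seed is an assumption, used only to make the run repeatable.
rng = random.Random(0)
data = ''.join(rng.choice(string.ascii_letters) for _ in range(N))

compressed_len = len(bz2.compress(data.encode('ascii')))
bound = entropy_bound_bytes(N, len(string.ascii_letters))

print(f"raw: {N} bytes, bz2: {compressed_len} bytes, bound: {bound:.1f} bytes")
```

The compressed size should land a little above the bound and well below the raw 21000 bytes: the saving comes from each character needing only about 5.7 of its 8 bits, not from any pattern in the data.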