On Thu, Feb 19, 2015 at 11:24 AM, Dave Angel <da...@davea.name> wrote:
> Here's a couple of ranges of output, showing that the 7bit scheme does
> better for values between 384 and 16379.
>
> 382 2 80fe --- 2 7e82
> 383 2 80ff --- 2 7f82
> 384 3 810000 --- 2 0083
> 384 jan grew 3 810000
> 385 3 810001 --- 2 0183
> 386 3 810002 --- 2 0283
> 387 3 810003 --- 2 0383
> 388 3 810004 --- 2 0483
> 389 3 810005 --- 2 0583
>
> 16380 3 813e7c --- 2 7cff
> 16380 jan grew 3 813e7c
> 16380 7bit grew 2 7cff
> 16381 3 813e7d --- 2 7dff
> 16382 3 813e7e --- 2 7eff
> 16383 3 813e7f --- 2 7fff
> 16384 3 813e80 --- 3 000081
> 16384 7bit grew 3 000081
> 16385 3 813e81 --- 3 010081
> 16386 3 813e82 --- 3 020081
> 16387 3 813e83 --- 3 030081
> 16388 3 813e84 --- 3 040081
> 16389 3 813e85 --- 3 050081
>
> In all my experimenting, I haven't found any values where the 7bit scheme
> does worse. It seems likely that for extremely large integers, it will, but
> if those are to be the intended distribution, the 7bit scheme could be
> replaced by something else, like just encoding a length at the beginning,
> and using raw bytes after that.

It looks like you're counting whole bytes, not bits. That would be important since the "difficult" encoding uses fractional bytes.
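
For anyone following along, here is a minimal sketch of a 7-bit encoder that reproduces the right-hand column of the quoted output. It is reconstructed from those byte strings only, not taken from Dave's code, and it assumes the scheme emits 7-bit groups least-significant first, with the high bit set only on the final byte. The names encode_7bit and decode_7bit are hypothetical. It reports sizes in whole bytes and in the equivalent bit count, which is the unit a comparison against a fractional-byte encoding would need.

```python
def encode_7bit(n):
    """Encode a non-negative int as little-endian 7-bit groups.

    Assumption (inferred from the quoted output): the high bit is set
    only on the final byte, marking the end of the value.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    out = bytearray()
    while n >= 0x80:
        out.append(n & 0x7F)   # low 7 bits, high bit clear: more bytes follow
        n >>= 7
    out.append(n | 0x80)       # last group, high bit set: terminator
    return bytes(out)


def decode_7bit(data):
    """Inverse of encode_7bit; stops at the first byte with the high bit set."""
    n = 0
    shift = 0
    for b in data:
        n |= (b & 0x7F) << shift
        shift += 7
        if b & 0x80:
            return n
    raise ValueError("no terminating byte found")


# Sizes around the two boundaries shown in the quoted output.
for n in (383, 384, 16383, 16384):
    enc = encode_7bit(n)
    print(n, len(enc), "bytes", len(enc) * 8, "bits", enc.hex())
```

Under this assumption the encoded length only grows at powers of 128 (128, 16384, ...), which matches the "7bit grew" marker at 16384 in the quote.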