On 02/19/2015 01:32 PM, Ian Kelly wrote:
On Thu, Feb 19, 2015 at 11:24 AM, Dave Angel <da...@davea.name> wrote:
Here are a couple of ranges of output showing that the 7bit scheme does
better for values between 384 and 16383. Each line gives the value, then
the byte count and hex for jan's scheme, then the byte count and hex for
the 7bit scheme.
382 2 80fe --- 2 7e82
383 2 80ff --- 2 7f82
384 3 810000 --- 2 0083
384 jan grew 3 810000
385 3 810001 --- 2 0183
386 3 810002 --- 2 0283
387 3 810003 --- 2 0383
388 3 810004 --- 2 0483
389 3 810005 --- 2 0583
16380 3 813e7c --- 2 7cff
16380 jan grew 3 813e7c
16380 7bit grew 2 7cff
16381 3 813e7d --- 2 7dff
16382 3 813e7e --- 2 7eff
16383 3 813e7f --- 2 7fff
16384 3 813e80 --- 3 000081
16384 7bit grew 3 000081
16385 3 813e81 --- 3 010081
16386 3 813e82 --- 3 020081
16387 3 813e83 --- 3 030081
16388 3 813e84 --- 3 040081
16389 3 813e85 --- 3 050081
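For reference, here's a minimal sketch of an encoder that reproduces
the 7bit column above. This is reconstructed from the printed hex, not
necessarily the code that produced it: little-endian 7-bit groups, high
bit clear on intermediate bytes and set on the last one.

def encode_7bit(n):
    # Assumes n >= 0.  Emit little-endian 7-bit groups; the high bit
    # marks the terminating byte, e.g. 384 -> 00 83, 16383 -> 7f ff,
    # 16384 -> 00 00 81.
    out = bytearray()
    while True:
        group = n & 0x7f
        n >>= 7
        if n == 0:
            out.append(group | 0x80)  # final byte: high bit set
            return bytes(out)
        out.append(group)             # more to come: high bit clear

for value in (383, 384, 16383, 16384):
    encoded = encode_7bit(value)
    print(value, len(encoded), encoded.hex())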
In all my experimenting, I haven't found any values where the 7bit
scheme does worse. It seems likely that it will for extremely large
integers, but if those are the intended distribution, the 7bit scheme
could be replaced by something else, such as encoding a length at the
beginning and using raw bytes after that; a sketch of that follows.
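A minimal sketch of that alternative, with a single length byte up
front (the one-byte length field is just an assumption here, to show
the shape of it):

def encode_length_prefixed(n):
    # One length byte, then the integer's raw bytes, big-endian.
    # The overhead is constant, so for very large integers this beats
    # any scheme that spends one bit in eight on continuation.
    payload = n.to_bytes((n.bit_length() + 7) // 8 or 1, 'big')
    assert len(payload) < 256  # a real scheme would allow longer lengths
    return bytes([len(payload)]) + payload

print(encode_length_prefixed(384).hex())     # 020180 -- 3 bytes total
print(len(encode_length_prefixed(2**1000))) # 127 bytes vs 143 for 7-bit groups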
It looks like you're counting whole bytes, not bits. That would be
important since the "difficult" encoding uses fractional bytes.
Not the implementation that was shared. I've only seen one set of
Python code for "difficult", and it was strictly bytes, as I said
earlier in the message you quoted from.
Naturally, I question whether the original description makes sense for
sub-byte encoding, since it was claimed that these are NOT for lists or
sequences of arbitrary integers, but only for a single one at a time,
presumably mixed with other data which may or may not lend itself to
bit-level encoding.
--
DaveA