On Thu, Feb 19, 2015 at 12:54 AM, Dave Angel <da...@davea.name> wrote: >>> I've tried to read through the original algorithm description, but I'm >>> not entirely sure: How many payload bits per transmitted byte does it >>> actually achieve? >> >> >> I don't think that payload bits per byte makes sense in this concept. >> > > Correct. Presumably one means average payload bits per byte. > > First one would have to define what the "standard" unencoded variable length > integer format was. Then one could call that size the payload size. Then, > in order to compute an average, one would have to specify an expected, or > target distribution of values. One then compares and averages the payload > size for each typical value with the encoded size.
Average, or separately for different ranges. Alternatively, identify which ranges of numbers can be encoded with how many bytes. For instance, if you have a varlen length (in bytes) followed by that many packed bytes, you would have: 2: up to 1<<8-1 3: up to 1<<16-1 4: up to 1<<24-1 ... 128: up to 1<<1016-1 130: up to 1<<1024-1 131: up to 1<<1032-1 etc Using varlen directly gives: 1: up to 1<<7-1 2: up to 1<<14-1 3: up to 1<<21-1 etc So if your number is around about 1<<20, you know it's going to take 3 bytes to encode as varlen, or 4 to encode with the meta-length. Easy comparisons. I'm not sure how this algorithm compares, because it's extremely clever. ChrisA -- https://mail.python.org/mailman/listinfo/python-list