On 02/18/2015 04:04 AM, janhein.vanderb...@gmail.com wrote:
> On Tuesday, February 17, 2015 at 3:35:16 PM UTC+1, Chris Angelico wrote:
>> Oh, incidentally: If you want a decent binary format for
>> variable-sized integer, check out the MIDI spec.
>
> I did some time ago, thanks, and it is indeed a decent format.
> I also looked at variations of that approach.
> None of them beats

Define "beats." You might mean beats in simplicity, or in elegance, or in clarity of code. More likely you mean in space efficiency, or "compression." But that's meaningless without a target distribution of the values you expect to encode.

For example, if 99.9% of your values are going to be less than 255, then the most efficient byte encoding would be one that simply stores any value below 255 in a single byte, and uses a leading FF byte as an escape for larger values. It's almost irrelevant how it encodes those larger values.
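
A minimal sketch of that escape-byte idea, in Python. The layout chosen for the rare large values (FF, then a one-byte length, then the big-endian bytes) is purely an assumption made for illustration, as are the names encode_small and decode_small; as said above, that part hardly matters.

```python
ESCAPE = 0xFF  # marker byte meaning "a larger value follows"


def encode_small(n):
    """Encode a non-negative int: one byte for 0..254, escaped otherwise."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    if n < ESCAPE:
        return bytes([n])
    # Assumed layout for large values: FF, payload length, payload bytes.
    payload = n.to_bytes((n.bit_length() + 7) // 8, "big")
    if len(payload) > 255:
        raise ValueError("value too large for this illustrative layout")
    return bytes([ESCAPE, len(payload)]) + payload


def decode_small(data):
    """Return (value, number_of_bytes_consumed)."""
    if data[0] != ESCAPE:
        return data[0], 1
    length = data[1]
    return int.from_bytes(data[2:2 + length], "big"), 2 + length
```

With this layout every value below 255 costs exactly one byte, and anything larger pays two bytes of overhead on top of its raw size.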

On the other hand, if most values are going to be in the 10,000- to 20,000-bit size range, and a few will be much smaller, and a few will be very much larger, then it would be very practical to start with a size field, say 16 bits, followed by the raw packed data. Naturally, the size field would need to have an escape value that indicates a larger field is needed. In fact, the size field could itself be encoded in a 7-bits-per-byte manner, so it could describe an arbitrarily large number as well.
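
As a rough sketch of the size-field approach: the length is written 7 bits per byte, MIDI style (high bit set on every byte except the last), and the raw packed data follows. Counting the size in bytes rather than bits is an assumption made to keep the example short, and the function names are made up for illustration.

```python
def encode_vlq(n):
    """7-bits-per-byte encoding, most significant group first."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    groups = [n & 0x7F]              # last byte: continuation bit clear
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)  # earlier bytes: continuation bit set
        n >>= 7
    return bytes(reversed(groups))


def decode_vlq(data, pos=0):
    """Return (value, position just past the encoded number)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, pos


def encode_sized(n):
    """Size field (in bytes, 7 bits per byte) followed by the raw data."""
    payload = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return encode_vlq(len(payload)) + payload


def decode_sized(data, pos=0):
    """Return (value, position just past the encoded number)."""
    length, pos = decode_vlq(data, pos)
    return int.from_bytes(data[pos:pos + length], "big"), pos + length
```

For a 15,000-bit value the size field takes two bytes here, so the overhead is tiny compared with spending one bit in every byte of the data.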


> "my" concept of two counters that cooperatively specify field lengths and
> represented integer values.
>
>> I've tried to read through the original algorithm description, but I'm
>> not entirely sure: How many payload bits per transmitted byte does it
>> actually achieve?
>
> I don't think that payload bits per byte makes sense in this concept.


Correct.  Presumably one means average payload bits per byte.

First one would have to define what the "standard" unencoded variable-length integer format is. Then one could call that size the payload size. Then, in order to compute an average, one would have to specify an expected, or target, distribution of values. One then compares the payload size with the encoded size for each typical value, and averages over that distribution.
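
To make that recipe concrete, here is a small sketch. Everything in it is an assumption chosen for illustration: the "payload size" is taken to be the minimal number of bits in the value, the encoding being measured is a plain 7-bits-per-byte one, and the distribution is invented. It divides expected payload bits by expected encoded bytes, which is only one of several reasonable ways to average.

```python
def payload_bits(n):
    """Assumed 'unencoded' size: the minimal bit count, at least 1."""
    return max(1, n.bit_length())


def vlq_size(n):
    """Bytes needed by a 7-bits-per-byte encoding of n."""
    return max(1, -(-n.bit_length() // 7))  # ceiling division


def average_bits_per_byte(distribution, encoded_size):
    """distribution maps value -> probability; probabilities sum to 1."""
    bits = sum(p * payload_bits(v) for v, p in distribution.items())
    size = sum(p * encoded_size(v) for v, p in distribution.items())
    return bits / size


# Hypothetical target distribution, for illustration only.
example = {5: 0.5, 300: 0.3, 70000: 0.2}
print(average_bits_per_byte(example, vlq_size))
```

Swap in a different distribution or a different encoded_size function and the ranking of encodings can change, which is the whole point.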

--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list
