On 02/18/2015 02:55 PM, janhein.vanderb...@gmail.com wrote:
> On Wednesday, February 18, 2015 at 5:47:49 PM UTC+1, Dave Angel wrote:
>> On 02/18/2015 03:59 AM, janhein.vanderb...@gmail.com wrote:
>>> encoding individual integers optimally without any assumptions
>>> about their values.
>>
>> Contradiction in terms.
>
> Not.
>
> Jan-Hein.
Then you had better define your new word "optimal" for us. I decided to
try your algorithm for all the values between 0 and 999999. Of those one
million values, the 7bit encoding[1] beat yours for 950081; the rest
were tied. Yours was never shorter.
For a uniform distribution over those values, the average number of
bytes used by yours was 3.933568, while 7bit encoding averaged 2.983487.
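Under the footnoted 7bit scheme, the byte count for a value depends only
on its bit length, so averages like the one above are easy to reproduce.
A sketch (the helper name `len_7bit` is mine, not from the thread):

```python
def len_7bit(n):
    """Bytes needed for n >= 0 under the MIDI-style scheme in [1]:
    7 payload bits per byte, so ceil(bit_length / 7), minimum 1."""
    return max(1, -(-n.bit_length() // 7))

# Tally the uniform average over 0..999999, as in the test described.
total = sum(len_7bit(n) for n in range(10**6))
print(total / 10**6)   # about 2.98 bytes per value
```

The breakpoints fall at 128 (2 bytes), 16384 (3 bytes), and 2**21
(4 bytes).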
For the second and third million, yours all take 4 bytes, while 7bit
uses 3. From 2097152 (2**21) onward, 7bit uses 4 bytes, the same as
yours. Between 16 and 17 million, yours averages 4.156865, while 7bit
is a constant 4.0.
After that, I started spot-checking. I went up to 100 billion, and for
none of those I tried did your algorithm take fewer bytes than 7bit.
So how is yours optimal? Over what range of values?
I'm not necessarily doubting it, just challenging you to provide a data
sample that actually shows it. And of course, I'm not claiming that
7bit is in any way optimal. You cannot define optimal without first
defining the distribution.
[1] By "7bit" I'm referring to the scheme apparently used in MIDI
encoding: 7 bits of each byte carry the value, and the 8th bit is
zero, except in the last byte, where it is one. So 3 bytes can encode
21 bits, i.e. values up to 2**21 - 1.
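That footnote can be sketched as a small encoder. This is my reading of
the description, not code from the thread; in particular the footnote
doesn't say which byte comes first, so the most-significant-group-first
order here is an assumption:

```python
def encode_7bit(n):
    """Encode n >= 0 as in [1]: 7 value bits per byte, high bit
    clear on every byte except the last, where it is set.
    (Most-significant 7-bit group first -- an assumption, since
    the footnote doesn't specify byte order.)"""
    if n < 0:
        raise ValueError("non-negative integers only")
    groups = []
    while True:
        groups.append(n & 0x7F)   # take the low 7 bits
        n >>= 7
        if not n:
            break
    groups.reverse()              # most significant group first
    groups[-1] |= 0x80            # mark the final byte
    return bytes(groups)
```

For example, encode_7bit(2**21 - 1) gives the three bytes
b'\x7f\x7f\xff', while 2**21 itself spills into a fourth byte.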
--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list