On Apr 22, 12:04 am, Ivan Illarionov <[EMAIL PROTECTED]> wrote:
> On Mon, 21 Apr 2008 16:10:05 -0700, George Sakkis wrote:
> > On Apr 21, 5:30 pm, Ivan Illarionov <[EMAIL PROTECTED]> wrote:
> >> On 22 Apr, 01:01, Peter Otten <[EMAIL PROTECTED]> wrote:
> >> > Ivan Illarionov wrote:
> >> > > And even faster:
> >> > > a = array.array('i', '\0' + '\0'.join((s[i:i+3] for i in xrange(0,
> >> > >     len(s), 3))))
> >> > > if sys.byteorder == 'little':
> >> > >     a.byteswap()
> >> > >
> >> > > I think it's a fastest possible implementation in pure python
> >> >
> >> > Clever, but note that it doesn't work correctly for negative numbers.
> >> > For those you'd have to prepend "\xff" instead of "\0".
> >> >
> >> > Peter
> >>
> >> Thanks for correction.
> >>
> >> Another step is needed:
> >>
> >> a = array.array('i', '\0' + '\0'.join((s[i:i+3] for i in xrange(0,
> >>     len(s), 3))))
> >> if sys.byteorder == 'little':
> >>     a.byteswap()
> >> result = [n if n < 0x800000 else n - 0x1000000 for n in a]
> >>
> >> And it's still pretty fast :)
> >
> > Indeed, the array idea is paying off for largeish inputs. On my box
> > (Python 2.5, WinXP, 2GHz Intel Core Duo), the cutoff point where
> > from3Bytes_array becomes faster than from3Bytes_struct is close to 150
> > numbers (=450 bytes).
> >
> > The struct solution though is now almost twice as fast with Psyco
> > enabled, while the array doesn't benefit from it. Here are some numbers
> > from a sample run:
> >
> > *** Without Psyco ***
> >  size=1
> >   from3Bytes_ord: 0.033493
> >   from3Bytes_struct: 0.018420
> >   from3Bytes_array: 0.089735
> >  size=10
> >   from3Bytes_ord: 0.140470
> >   from3Bytes_struct: 0.082326
> >   from3Bytes_array: 0.142459
> >  size=100
> >   from3Bytes_ord: 1.180831
> >   from3Bytes_struct: 0.664799
> >   from3Bytes_array: 0.690315
> >  size=1000
> >   from3Bytes_ord: 11.551990
> >   from3Bytes_struct: 6.390999
> >   from3Bytes_array: 5.781636
> > *** With Psyco ***
> >  size=1
> >   from3Bytes_ord: 0.039287
> >   from3Bytes_struct: 0.009453
> >   from3Bytes_array: 0.098512
> >  size=10
> >   from3Bytes_ord: 0.174362
> >   from3Bytes_struct: 0.045785
> >   from3Bytes_array: 0.162171
> >  size=100
> >   from3Bytes_ord: 1.437203
> >   from3Bytes_struct: 0.355930
> >   from3Bytes_array: 0.800527
> >  size=1000
> >   from3Bytes_ord: 14.248668
> >   from3Bytes_struct: 3.331309
> >   from3Bytes_array: 6.946709
> >
> > And here's the benchmark script:
> >
> > import struct
> > from array import array
> >
> > def from3Bytes_ord(s):
> >     return [n if n<0x800000 else n-0x1000000 for n in
> >             ((ord(s[i])<<16) | (ord(s[i+1])<<8) | ord(s[i+2])
> >              for i in xrange(0, len(s), 3))]
> >
> > unpack_i32be = struct.Struct('>l').unpack
> > def from3Bytes_struct(s):
> >     return [unpack_i32be(s[i:i+3] + '\0')[0]>>8
> >             for i in xrange(0,len(s),3)]
> >
> > def from3Bytes_array(s):
> >     a = array('l', ''.join('\0' + s[i:i+3]
> >                            for i in xrange(0,len(s), 3)))
> >     a.byteswap()
> >     return [n if n<0x800000 else n-0x1000000 for n in a]
> >
> > def benchmark():
> >     from timeit import Timer
> >     for n in 1,10,100,1000:
> >         print ' size=%d' % n
> >         # cycle between positive and negative
> >         buf = ''.join(struct.pack('>i', 1234567*(-1)**(i%2))[1:]
> >                       for i in xrange(n))
> >         for func in 'from3Bytes_ord', 'from3Bytes_struct', 'from3Bytes_array':
> >             print ' %s: %f' % (func,
> >                 Timer('%s(buf)' % func ,
> >                       'from __main__ import %s; buf=%r' % (func,buf)
> >                       ).timeit(10000))
> >
> > if __name__ == '__main__':
> >     s = ''.join(struct.pack('>i',v)[1:] for v in
> >                 [0,1,-2,500,-500,7777,-7777,-94496,98765,
> >                  -98765,8388607,-8388607,-8388608,1234567])
> >     assert from3Bytes_ord(s) == from3Bytes_struct(s) == from3Bytes_array(s)
> >
> >     print '*** Without Psyco ***'
> >     benchmark()
> >
> >     import psyco; psyco.full()
> >     print '*** With Psyco ***'
> >     benchmark()
> >
> > George
>
> Comments:
> You didn't use the faster version of array approach:
> ''.join('\0' + s[i:i+3] for i in xrange(0,len(s), 3))
> is slower than
> '\0' + '\0'.join(s[i:i+3] for i in xrange(0,len(s), 3))

Good catch; the faster version reduces the cutoff point between from3Bytes_array and from3Bytes_struct to only ~50 numbers (=150 bytes), without Psyco.

George
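
For anyone reading this thread on a current interpreter, here is a rough Python 3 sketch of the two approaches compared above: the struct version (pad with a trailing zero byte, then shift right by 8) and the faster array version (one leading '\0' plus a '\0'.join of the 3-byte groups, then the sign correction). The lowercase names from3bytes_struct and from3bytes_array, the use of bytes instead of str, the empty-input guard, and the 4-byte check on typecode 'i' are my assumptions and not part of the script posted in the thread. No timings are claimed for it.

```python
import struct
import sys
from array import array

_unpack_i32be = struct.Struct('>i').unpack


def from3bytes_struct(buf):
    """Decode big-endian signed 24-bit integers via struct.

    Each 3-byte group gets a trailing zero byte, is unpacked as a
    big-endian int32, and is shifted right by 8; the arithmetic shift
    keeps the sign.
    """
    return [_unpack_i32be(buf[i:i + 3] + b'\0')[0] >> 8
            for i in range(0, len(buf), 3)]


def from3bytes_array(buf):
    """Decode big-endian signed 24-bit integers via array.

    Assumes len(buf) is a multiple of 3 and that typecode 'i' is
    4 bytes wide on this platform (checked below).
    """
    if not buf:
        return []  # b'\0' alone is not a whole 4-byte item
    if array('i').itemsize != 4:
        raise RuntimeError("typecode 'i' is not 4 bytes on this platform")
    # One leading zero, then a zero between groups: every 3-byte group
    # ends up preceded by exactly one zero byte.
    padded = b'\0' + b'\0'.join(buf[i:i + 3] for i in range(0, len(buf), 3))
    a = array('i', padded)
    if sys.byteorder == 'little':
        a.byteswap()  # the data is big-endian
    # The leading zero byte makes every value non-negative, so fold the
    # upper half of the 24-bit range back into negative numbers.
    return [n if n < 0x800000 else n - 0x1000000 for n in a]
```

Both functions should return the same list for any input whose length is a multiple of 3, which makes it easy to check one against the other before timing them with timeit.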