Hi John, John Machin escribió:
>On 28/05/2006 12:10 AM, Gonzalo Monzón wrote: > >[good advice snipped] > > > >>Example A: >>This code is more than 80 times faster than a "easy" Python >>implementation. For every call, it does some bitwise operations and does >>an array lookup for every string character from argument. Its a lot >>faster because in Python approach a list lookup is done and it is a lot >>faster to do a C array lookup -thought that in these C loops no Python >>type value conversions are needed, if it where the case, C approach >>would not be so faster than python. I don't know how would perform an >>array based Python code, but I expect it to be a lot faster than using a >>list, so Python code can be speed up a lot if you know how to do it. >> >>// C code: >>int CRC16Table[256]; // Filled elsewhere >>int CalcCRC16(char *str) >>{ >> int crc; >> for(crc = 0xFFFF; *str != 0; str++) { >> crc = CRC16Table [(( crc >> 8 ) & 255 )] ^ ( crc << 8 ) ^ *str; >> >> > >Gonzalo, just in case there are any C compilers out there which need to >be told: > > > for(crc = 0xFFFF; *str != 0;) { > > crc = CRC16Table [(( crc >> 8 ) & 255 )] ^ ( crc << 8 ) ^ *str++; > > Thank you for the advise! I didn't know you couldn't advance pointer in the for in some compilers... > > > >> } >> return crc; >>} >> >># Python code >>gCRC16Table = [] # Filled elsewhere >>def CalcCRC16(astr): >> crc = 0xFFFFL >> >> > >Having that L on the end (plus the fact that you are pointlessly >maintaining "crc" as an *unsigned* 32-bit quantity) will be slowing the >calculation down -- Python will be doing it in long integers. You are >calculating a *sixteen bit* CRC! The whole algorithm can be written >simply so as to not need more than 16-bit registers, and not to pollute >high-order bits in 17-or-more-bit registers. > > > Yes I know but I plan to post a quick example for Jim, and got the first one file from several versions... :-) The issue was about Jim understanding how some code can be speed-up a lot and some other not and how that's not a trivial question. >> for c in astr: >> crc = gCRC16Table[((crc >> 8) & 255)] ^ ((crc & 0xFFFFFF) << 8) ^ >>ord(c) >> >> > >Note that *both* the C and Python routines still produce a 32-bit result >with 16 bits of high-order rubbish -- I got the impression from the >previous thread that you were going to fix that. > > Yes of course! I plan to spend some time on this issue, the last week I had not much time to work on this, but thought it worth the pain to setup a compiling environment -ms.evc++ obviously-, and got succesfuly compiled Python and some of these own custom Pyrex extensions for the PocketPC, easily, only adding the C files to makefile, as Pyrex glue code compiles well on ARM, so I have to make some timings and decide what version to use for the code that won't be likely to be changed in long time. I still have to test the last improved Python array based approach and make some timings on the PDA. >This Python routine never strays outside 16 bits, so avoiding your "& >255" and a final "& 0xFFFF" (which you don't have). > >def CalcCRC16(astr): > crc = 0xFFFF > for c in astr: > crc = gCRC16Table[crc >> 8] ^ ((crc & 0xFF) << 8) ^ ord(c) > return crc > > Thank you again for your thoughts John! :-) Regards, Gonzalo >============== >To the OP: > >I'd just like to point out that C code and Pyrex code can gain >signicantly (as the above example does) by not having to use ord() and >chr(). > >As Gonzalo says, read the generated C code. Look for other cases of >using Python built-ins that could be much faster with a minor bit of >effort in Pyrex e.g. "max(a, b)" -> "(a) > (b) ? (a) : (b) " or if you >don't like that, a cdef function to get the max of 2 ints will be *way* >faster than calling Python's max() > > -- http://mail.python.org/mailman/listinfo/python-list