[issue4868] Faster utf-8 decoding

2010-04-03 Thread Ezio Melotti
Changes by Ezio Melotti:
nosy: +ezio.melotti

[issue4868] Faster utf-8 decoding

2009-01-10 Thread Antoine Pitrou
Antoine Pitrou added the comment: I committed the patch with the last suggested change (word -> data) in py3k (r68483). I don't intend to backport it to trunk, but I suppose it wouldn't be too much work to do.
resolution: -> fixed
status: open -> closed

[issue4868] Faster utf-8 decoding

2009-01-08 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
Antoine Pitrou wrote:
> Antoine Pitrou added the comment:
>
> Marc-Andre, this patch should address your comments.
>
> Added file: http://bugs.python.org/file12656/decode6.patch

Thanks. Much better!

BTW: I'd also change the variable name "word" to some

[issue4868] Faster utf-8 decoding

2009-01-08 Thread Antoine Pitrou
Antoine Pitrou added the comment: Marc-Andre, this patch should address your comments. Added file: http://bugs.python.org/file12656/decode6.patch

[issue4868] Faster utf-8 decoding

2009-01-08 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
Antoine Pitrou wrote:
> Antoine Pitrou added the comment:
>
> Attached patch adds acceleration for latin1 and utf16 decoding as well.
>
> All three codecs (utf8, utf16, latin1) are now in the same ballpark
> performance-wise on favorable input: on my mac

[issue4868] Faster utf-8 decoding

2009-01-08 Thread Antoine Pitrou
Antoine Pitrou added the comment: (PS: performance measured on UCS-2 and UCS-4 builds with gcc, and under Windows with MSVC)

[issue4868] Faster utf-8 decoding

2009-01-08 Thread Antoine Pitrou
Antoine Pitrou added the comment: Attached patch adds acceleration for latin1 and utf16 decoding as well. All three codecs (utf8, utf16, latin1) are now in the same ballpark performance-wise on favorable input: on my machine, they are able to decode at almost 1GB/s. (unpatched, it is between
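Throughput figures like the ones quoted above can be approximated from Python itself. The following is only a rough sketch, not the benchmark used in this thread: the function name `decode_throughput` and the 1 MiB pure-ASCII payload are assumptions chosen to model "favorable input", and the numbers it prints depend entirely on the machine and interpreter build.

```python
import timeit


def decode_throughput(encoding: str, payload: bytes,
                      repeat: int = 5, number: int = 20) -> float:
    """Return the best observed decode throughput in bytes per second.

    Hypothetical helper: takes the fastest of `repeat` timing runs,
    each of which decodes `payload` `number` times.
    """
    best = min(timeit.repeat(lambda: payload.decode(encoding),
                             repeat=repeat, number=number))
    return len(payload) * number / best


if __name__ == "__main__":
    # Pure-ASCII text is the favorable case for all three codecs.
    text = "a" * (1 << 20)
    for enc in ("utf-8", "utf-16", "latin-1"):
        payload = text.encode(enc)
        rate = decode_throughput(enc, payload)
        print(f"{enc}: {rate / 1e9:.2f} GB/s")
```

Taking the minimum over several repeats reduces noise from other processes, which matters when comparing runs that differ by only a few percent.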

[issue4868] Faster utf-8 decoding

2009-01-08 Thread Kevin Watters
Changes by Kevin Watters:
nosy: +kevinwatters

[issue4868] Faster utf-8 decoding

2009-01-08 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> Attached patch
> (utf8decode4.patch) changes this and may enter the fast loop on the
> first character.

Thanks!

> Does this idea apply to the encode function as well?

Probably, although with less efficiency (a long can hold 1, 2 or 4 unicode characters depe
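The "1, 2 or 4 unicode characters" remark can be checked with a little arithmetic. This is a minimal sketch under the assumption that the count is simply the size of a C long divided by the size of one character (2 bytes on a UCS-2 build, 4 bytes on a UCS-4 build); `chars_per_long` is a hypothetical name, not something from the patch.

```python
def chars_per_long(long_size: int, char_size: int) -> int:
    """Number of fixed-width characters that fit in one C long.

    Assumption: long_size and char_size are sizes in bytes.
    """
    return long_size // char_size


if __name__ == "__main__":
    for long_size in (4, 8):          # 32-bit and 64-bit long
        for char_size in (2, 4):      # UCS-2 and UCS-4 characters
            count = chars_per_long(long_size, char_size)
            print(f"{long_size}-byte long, {char_size}-byte characters: "
                  f"{count} per word")
```

By comparison, the same long covers 4 or 8 input bytes when decoding, which is consistent with the expectation of a smaller gain on the encoding side.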

[issue4868] Faster utf-8 decoding

2009-01-08 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: Very nice! It seems that you can get slightly faster by not copying the initial char first: 's' is often already aligned at the beginning of the string, but not after the first copy... Attached patch (utf8decode4.patch) changes this and may enter the fast

[issue4868] Faster utf-8 decoding

2009-01-07 Thread Antoine Pitrou
Changes by Antoine Pitrou: Removed file: http://bugs.python.org/file12638/utf8decode.patch

[issue4868] Faster utf-8 decoding

2009-01-07 Thread Antoine Pitrou
Antoine Pitrou added the comment: Reopening and attaching a more ambitious patch, based on the optimization of runs of ASCII characters. This time the speedup is much more impressive, up to 75% faster on pure ASCII input -- actually faster than latin1. The worst case (tight interleaving of ASCI
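The idea of optimizing runs of ASCII characters can be illustrated in Python, although the real patch is C code that is not reproduced in this listing. The sketch below is an assumption about the general technique, not the committed implementation: it tests one machine word of input at a time against a mask of high bits, and falls back to a byte-by-byte loop at the end of the run. The names `ascii_prefix_length` and `decode_utf8` are hypothetical.

```python
import struct

# Size of a C long on this platform (assumed to be the word size used).
WORD_SIZE = struct.calcsize("L")
# One 0x80 bit per byte: any set bit means a non-ASCII byte in the word.
HIGH_BITS = int.from_bytes(b"\x80" * WORD_SIZE, "little")


def ascii_prefix_length(data: bytes) -> int:
    """Return the length of the leading run of pure-ASCII bytes."""
    pos = 0
    n = len(data)
    # Fast loop: examine WORD_SIZE bytes with a single mask test.
    while pos + WORD_SIZE <= n:
        word = int.from_bytes(data[pos:pos + WORD_SIZE], "little")
        if word & HIGH_BITS:
            break
        pos += WORD_SIZE
    # Slow loop: finish the run one byte at a time.
    while pos < n and data[pos] < 0x80:
        pos += 1
    return pos


def decode_utf8(data: bytes) -> str:
    """Decode UTF-8, handling the leading ASCII run separately."""
    split = ascii_prefix_length(data)
    # The split always falls before the first non-ASCII byte, so the
    # remainder starts on a character boundary for valid input.
    return data[:split].decode("ascii") + data[split:].decode("utf-8")
```

In Python the word-at-a-time loop is of course slower than the built-in decoder; the sketch only shows why pure-ASCII input is the best case and tightly interleaved ASCII and non-ASCII input the worst, since the fast loop exits at the first word containing a high bit.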

[issue4868] Faster utf-8 decoding

2009-01-07 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
On 2009-01-07 16:25, Antoine Pitrou wrote:
> New submission from Antoine Pitrou:
>
> Here is a patch to speedup utf8 decoding. On a 64-bit build, the maximum
> speedup is around 30%, and on a 32-bit build around 15%. (*)
>
> The patch may look disturbingl

[issue4868] Faster utf-8 decoding

2009-01-07 Thread Antoine Pitrou
Antoine Pitrou added the comment: Ha, the patch makes things slower on MSVC. The patch can probably be rejected, then. (and interestingly, MSVC produces 40% faster code than gcc on my mini-bench, despite the virtual machine overhead)
resolution: -> rejected
status: open -> closed

[issue4868] Faster utf-8 decoding

2009-01-07 Thread Antoine Pitrou
Antoine Pitrou added the comment: As I said I don't think it's due to register allocation, but simply avoiding register write-to-read dependencies by using separate variables for the loop count and the pointer. I'm gonna try under Windows (in a virtual machine, but it shouldn't make much differe

[issue4868] Faster utf-8 decoding

2009-01-07 Thread Martin v. Löwis
Martin v. Löwis added the comment: Can you please upload it to Rietveld? I'm skeptical about changes that merely rely on the compiler's register allocator to do a better job. This kind of change tends to pessimize the code for other compilers, and also may pessimize it for future versions of th

[issue4868] Faster utf-8 decoding

2009-01-07 Thread Antoine Pitrou
New submission from Antoine Pitrou: Here is a patch to speed up utf8 decoding. On a 64-bit build, the maximum speedup is around 30%, and on a 32-bit build around 15%. (*) The patch may look disturbingly trivial, and I haven't studied the assembler output, but I think it is explained by the fact