New submission from Antoine Pitrou <pit...@free.fr>:

Here is a patch to speed up UTF-8 decoding. On a 64-bit build, the
maximum speedup is around 30%; on a 32-bit build, around 15%. (*)

The patch may look disturbingly trivial, and I haven't studied the
assembler output, but I think the speedup is explained by the fact
that having a separate loop counter breaks the register dependencies:
when the 's' pointer was incremented, other operations had to wait
for the increment to be committed (see the sketch appended below).

[side note: UTF-8 encoding is still much faster than decoding, but
that may be because it allocates a smaller object, regardless of the
iteration count]

The same principle can probably be applied to the other decoding
functions in unicodeobject.c, but first I wanted to know whether the
principle is OK to apply. Marc-André, what is your take?

(*) The benchmark I used is:

./python -m timeit -s "import codecs;c=codecs.utf_8_decode;s=b'abcde'*1000" "c(s)"

More complex input also gets a speedup, albeit a smaller one (~10%):

./python -m timeit -s "import codecs;c=codecs.utf_8_decode;s=b'\xc3\xa9\xe7\xb4\xa2'*1000" "c(s)"

----------
components: Interpreter Core
files: utf8decode.patch
keywords: patch
messages: 79338
nosy: lemburg, pitrou
priority: normal
severity: normal
stage: patch review
status: open
title: Faster utf-8 decoding
type: performance
versions: Python 3.1
Added file: http://bugs.python.org/file12638/utf8decode.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue4868>
_______________________________________
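
For illustration, here is a minimal sketch of the separate-loop-counter
idea. The function names and the straight byte-widening loop are
simplified stand-ins chosen for this example; they are not the actual
code from Objects/unicodeobject.c or from the attached patch.

    #include <stddef.h>

    /* Before: two pointers are incremented on every iteration, so each
       dereference has to wait for the previous pointer increment to be
       committed, serializing the loop. */
    static void
    widen_before(const unsigned char *s, const unsigned char *e,
                 unsigned int *p)
    {
        while (s < e)
            *p++ = (unsigned int)*s++;
    }

    /* After: the base pointers stay fixed and a single counter indexes
       both buffers; the address computations s + i and p + i share one
       short dependency chain instead of two. */
    static void
    widen_after(const unsigned char *s, size_t size, unsigned int *p)
    {
        size_t i;
        for (i = 0; i < size; i++)
            p[i] = (unsigned int)s[i];
    }

Whether the indexed form actually schedules better depends on the
compiler and target; the percentages quoted above come from the timeit
runs in the message, not from this sketch.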
The patch may look disturbingly trivial, and I haven't studied the assembler output, but I think it is explained by the fact that having a separate loop counter breaks the register dependencies (when the 's' pointer was incremented, other operations had to wait for the incrementation to be committed). [side note: utf8 encoding is still much faster than decoding, but it may be because it allocates a smaller object, regardless of the iteration count] The same principle can probably be applied to the other decoding functions in unicodeobject.c, but first I wanted to know whether the principle is ok to apply. Marc-André, what is your take? (*) the benchmark I used is: ./python -m timeit -s "import codecs;c=codecs.utf_8_decode;s=b'abcde'*1000" "c(s)" More complex input also gets a speedup, albeit a smaller one (~10%): ./python -m timeit -s "import codecs;c=codecs.utf_8_decode;s=b'\xc3\xa9\xe7\xb4\xa2'*1000" "c(s)" ---------- components: Interpreter Core files: utf8decode.patch keywords: patch messages: 79338 nosy: lemburg, pitrou priority: normal severity: normal stage: patch review status: open title: Faster utf-8 decoding type: performance versions: Python 3.1 Added file: http://bugs.python.org/file12638/utf8decode.patch _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue4868> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com