New submission from Serhiy Storchaka <storch...@gmail.com>:

Charmap decoders are not as important as UTF decoders, but they are still widely used. In Python 3.3, with PEP 393, they became about 4x slower. The proposed patch restores the performance.
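For context, a charmap decoder turns each input byte into a character through a lookup, which can be given either as a table (a string indexed by byte value) or as a map (a dict from byte value to code point). The sketch below shows both forms through the stdlib `codecs.charmap_decode`. The table layout is hypothetical and chosen only for illustration; it is not the real cp1251 table and is not taken from the patch.

```python
import codecs

# Table form: a str indexed by byte value.
# Hypothetical layout for illustration only (not cp1251):
# bytes 0x00-0x7F map to themselves, bytes 0x80-0xFF map to U+0400-U+047F.
decoding_table = "".join(chr(i) for i in range(128)) + "".join(
    chr(0x0400 + i) for i in range(128)
)

# Map form: a dict from byte value to code point, carrying the same data.
decoding_map = {i: ord(ch) for i, ch in enumerate(decoding_table)}

data = bytes([0x41, 0x80, 0xFF])

# charmap_decode returns a (decoded_text, bytes_consumed) pair.
text_from_table, consumed_table = codecs.charmap_decode(data, "strict", decoding_table)
text_from_map, consumed_map = codecs.charmap_decode(data, "strict", decoding_map)

print(text_from_table == text_from_map)
```

Both calls produce the same text; only the lookup structure differs, which is why the two forms can have different performance.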
Only the most common case is optimized: a decoder specified by a UCS2 table of length >= 256. Map-based decoders are translated to table-based ones. UCS1 tables are widened to UCS2 by appending a fake 257th character.

Benchmark results (the percentage in parentheses is the gain of the patched build over that column):

                          3.2          3.3(vanilla)   3.3(patched)

cp1251 'A'*10000          111 (+10%)   31 (+294%)     122
cp1251 '\xa0'*10000       111 (+8%)    29 (+314%)     120
cp1251 '\u0402'*10000     111 (+6%)    25 (+372%)     118

----------
components: Interpreter Core, Unicode
files: decode_charmap.patch
keywords: patch
messages: 161301
nosy: ezio.melotti, haypo, lemburg, pitrou, storchaka
priority: normal
severity: normal
status: open
title: Faster charmap decoding
type: performance
versions: Python 3.3
Added file: http://bugs.python.org/file25664/decode_charmap.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue14874>
_______________________________________
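A rough way to time the three benchmark inputs on a local build is sketched below. The report does not say which harness or units produced the figures above, so the use of `timeit`, the helper name `time_decode`, and the repeat counts are all assumptions; absolute numbers will not be comparable to the table.

```python
import timeit


def time_decode(payload, encoding="cp1251", repeat=5, number=200):
    """Return the best observed time per decode call, in seconds.

    Hypothetical helper, not part of the patch or the original benchmark.
    """
    timings = timeit.repeat(lambda: payload.decode(encoding), repeat=repeat, number=number)
    return min(timings) / number


# Byte payloads corresponding to the three benchmark rows.
payloads = {
    "'A'*10000": ("A" * 10000).encode("cp1251"),
    "'\\xa0'*10000": ("\xa0" * 10000).encode("cp1251"),
    "'\\u0402'*10000": ("\u0402" * 10000).encode("cp1251"),
}

results = {label: time_decode(data) for label, data in payloads.items()}

for label, seconds in results.items():
    print(f"cp1251 {label}: {seconds * 1e6:.1f} us per decode")
```

Running the same script under each interpreter build and comparing the per-call times gives a relative picture of the slowdown and of the effect of the patch.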