New submission from Ma Lin <malin...@163.com>:
C type `long` is a 4-byte integer in 64-bit Windows builds. [1]
But the `ucs1lib_find_max_char()` function [2] uses SIZEOF_LONG, so it reads only 4 bytes per step and loses a little performance in 64-bit Windows builds.

Below is the benchmark of using SIZEOF_SIZE_T and this change:

-    unsigned long value = *(unsigned long *) _p;
+    size_t value = *(size_t *) _p;

D:\dev\cpython\PCbuild\amd64\python.exe -m pyperf timeit -s "b=b'a'*10_000_000; f=b.decode;" "f('latin1')"

before: 5.83 ms +- 0.05 ms
after : 5.58 ms +- 0.06 ms

[1] https://stackoverflow.com/questions/384502
[2] https://github.com/python/cpython/blob/v3.8.0b4/Objects/stringlib/find_max_char.h#L9

Maybe there can be more optimizations, so I didn't prepare a PR for this.

----------
components: Interpreter Core
messages: 352970
nosy: Ma Lin, inada.naoki, serhiy.storchaka, sir-sigurd
priority: normal
severity: normal
status: open
title: micro-optimize ucs1lib_find_max_char in Windows 64-bit build
type: performance
versions: Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue38252>
_______________________________________
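
For readers unfamiliar with the technique, here is a minimal sketch of a word-at-a-time ASCII scan that uses `size_t` as the word type, so each step covers 8 bytes on 64-bit Windows instead of the 4 bytes that `unsigned long` gives there. It is not CPython's code: the name `find_max_char_sketch` is hypothetical, and it copies each word with `memcpy` to avoid unaligned reads, whereas the change proposed above casts the pointer directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the CPython implementation.
   Returns 127 if every byte in [p, p + n) is ASCII, otherwise 255. */
static unsigned int
find_max_char_sketch(const unsigned char *p, size_t n)
{
    /* Top bit of every byte set: 0x80808080 when size_t is 4 bytes,
       0x8080808080808080 when it is 8 bytes. */
    const size_t mask = (size_t)-1 / 0xFF * 0x80;
    const unsigned char *end;

    assert(p != NULL);
    end = p + n;

    /* Test sizeof(size_t) bytes per iteration. */
    while ((size_t)(end - p) >= sizeof(size_t)) {
        size_t value;
        memcpy(&value, p, sizeof value);  /* safe for unaligned input */
        if (value & mask)
            return 255;
        p += sizeof value;
    }

    /* Test the remaining tail one byte at a time. */
    while (p < end) {
        if (*p++ & 0x80)
            return 255;
    }
    return 127;
}
```

Because the mask is derived from `(size_t)-1`, the same source adapts to whatever width `size_t` has on the target platform, which is the point of switching from SIZEOF_LONG to SIZEOF_SIZE_T.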