New submission from Sergey Fedoseev <fedoseev.ser...@gmail.com>:

PyLong_As*() functions compute the result for large longs like this:

    size_t x, prev;
    x = 0;
    while (--i >= 0) {
        prev = x;
        x = (x << PyLong_SHIFT) | v->ob_digit[i];
        if ((x >> PyLong_SHIFT) != prev) {
            *overflow = sign;
            goto exit;
        }
    }

It can be rewritten like this:

    size_t x = 0;
    while (--i >= 0) {
        if (x > (size_t)-1 >> PyLong_SHIFT) {
            goto overflow;
        }
        x = (x << PyLong_SHIFT) | v->ob_digit[i];
    }

This provides some speed-up:

PyLong_AsSsize_t()

$ python -m perf timeit -s "from struct import Struct; N = 1000; pack = Struct('n'*N).pack; values = (2**30,)*N" "pack(*values)" --compare-to=../cpython-master/venv/bin/python
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 9.69 us +- 0.02 us
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 8.61 us +- 0.07 us

Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 9.69 us +- 0.02 us -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 8.61 us +- 0.07 us: 1.12x faster (-11%)

PyLong_AsSize_t()

$ python -m perf timeit -s "from struct import Struct; N = 1000; pack = Struct('N'*N).pack; values = (2**30,)*N" "pack(*values)" --compare-to=../cpython-master/venv/bin/python
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 10.5 us +- 0.1 us
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 8.19 us +- 0.17 us

Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 10.5 us +- 0.1 us -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 8.19 us +- 0.17 us: 1.29x faster (-22%)

PyLong_AsLong()

$ python -m perf timeit -s "from struct import Struct; N = 1000; pack = Struct('l'*N).pack; values = (2**30,)*N" "pack(*values)" --compare-to=../cpython-master/venv/bin/python
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 9.68 us +- 0.02 us
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 8.48 us +- 0.22 us

Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 9.68 us +- 0.02 us -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 8.48 us +- 0.22 us: 1.14x faster (-12%)

PyLong_AsUnsignedLong()

$ python -m perf timeit -s "from struct import Struct; N = 1000; pack = Struct('L'*N).pack; values = (2**30,)*N" "pack(*values)" --compare-to=../cpython-master/venv/bin/python
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 10.5 us +- 0.1 us
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 8.41 us +- 0.26 us

Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 10.5 us +- 0.1 us -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 8.41 us +- 0.26 us: 1.25x faster (-20%)

The same pattern is also used in PyLong_AsLongLongAndOverflow(), but I left it untouched since the proposed change doesn't seem to affect its performance.

----------
components: Interpreter Core
messages: 350091
nosy: sir-sigurd
priority: normal
severity: normal
status: open
title: speed-up PyLong_As*() for large longs
type: performance
versions: Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue37907>
_______________________________________
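
For readers who want to compare the two overflow checks from the report outside of CPython, here is a minimal standalone sketch. It is not the CPython source: DEMO_SHIFT, demo_digit and both function names are hypothetical stand-ins for PyLong_SHIFT, digit and the loops inside PyLong_As*(), and the 30-bit digit width is an assumption that matches common 64-bit builds. Digits are stored least significant first, as in ob_digit.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 30-bit digits, mirroring PyLong_SHIFT on common 64-bit builds. */
#define DEMO_SHIFT 30
typedef uint32_t demo_digit;

/* Original pattern: shift the digit in, then shift back and compare with
 * the previous value to see whether high bits were lost.
 * Returns 0 on success and stores the value in *out, or -1 on overflow. */
static int
accumulate_check_after(const demo_digit *digits, ptrdiff_t ndigits, size_t *out)
{
    size_t x = 0, prev;
    ptrdiff_t i = ndigits;
    while (--i >= 0) {
        prev = x;
        x = (x << DEMO_SHIFT) | digits[i];
        if ((x >> DEMO_SHIFT) != prev) {
            return -1;
        }
    }
    *out = x;
    return 0;
}

/* Proposed pattern: test before shifting whether x still has DEMO_SHIFT
 * free high bits, so no previous value and no shift back are needed.
 * Returns 0 on success and stores the value in *out, or -1 on overflow. */
static int
accumulate_check_before(const demo_digit *digits, ptrdiff_t ndigits, size_t *out)
{
    size_t x = 0;
    ptrdiff_t i = ndigits;
    while (--i >= 0) {
        if (x > (size_t)-1 >> DEMO_SHIFT) {
            return -1;
        }
        x = (x << DEMO_SHIFT) | digits[i];
    }
    *out = x;
    return 0;
}
```

With these definitions, the two-digit input {0, 1} (the value 2**30 used in the benchmarks) should be accepted by both helpers, while a five-digit input such as {0, 0, 0, 0, 1} (2**120) should be rejected by both when size_t is 32 or 64 bits wide.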