New submission from Sergey Fedoseev <fedoseev.ser...@gmail.com>:

The PyLong_As*() functions compute the result for large longs like this:

size_t x, prev;
x = 0;
while (--i >= 0) {
    prev = x;
    x = (x << PyLong_SHIFT) | v->ob_digit[i];
    if ((x >> PyLong_SHIFT) != prev) {
        *overflow = sign;
        goto exit;
    }
}

It can be rewritten like this:

size_t x = 0;
while (--i >= 0) {
    if (x > (size_t)-1 >> PyLong_SHIFT) {
        goto overflow;
    }
    x = (x << PyLong_SHIFT) | v->ob_digit[i];
}
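
A minimal standalone sketch of why the two checks should agree, for anyone who wants to try it outside CPython. The names SHIFT, digit, accumulate_postcheck(), accumulate_precheck() and variants_agree() are made up for this illustration and are not part of longobject.c; SHIFT = 30 is assumed here to mirror a common PyLong_SHIFT value. As long as every digit is below 2**SHIFT, x << SHIFT drops bits exactly when x is greater than (size_t)-1 >> SHIFT, so checking before the shift detects the same overflows without keeping prev around.

```c
#include <stddef.h>
#include <stdint.h>

#define SHIFT 30            /* assumed digit width, stand-in for PyLong_SHIFT */
typedef uint32_t digit;     /* hypothetical digit type; values must be < 2**SHIFT */

/* Current pattern: shift first, then verify that shifting back recovers prev.
   Returns 0 and stores the value in *out, or -1 on overflow. */
static int accumulate_postcheck(const digit *d, int n, size_t *out)
{
    size_t x = 0, prev;
    int i = n;
    while (--i >= 0) {
        prev = x;
        x = (x << SHIFT) | d[i];
        if ((x >> SHIFT) != prev) {
            return -1;
        }
    }
    *out = x;
    return 0;
}

/* Proposed pattern: refuse to shift when the top SHIFT bits are already in use. */
static int accumulate_precheck(const digit *d, int n, size_t *out)
{
    size_t x = 0;
    int i = n;
    while (--i >= 0) {
        if (x > (size_t)-1 >> SHIFT) {
            return -1;
        }
        x = (x << SHIFT) | d[i];
    }
    *out = x;
    return 0;
}

/* Returns 1 when both variants report the same status and, if no overflow,
   the same value. */
static int variants_agree(const digit *d, int n)
{
    size_t a = 0, b = 0;
    int ra = accumulate_postcheck(d, n, &a);
    int rb = accumulate_precheck(d, n, &b);
    return ra == rb && (ra != 0 || a == b);
}
```

Digits are stored least significant first, as in ob_digit, so {5, 1} stands for 1 * 2**30 + 5.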

This provides some speed-up:

PyLong_AsSsize_t()
$ python -m perf timeit -s "from struct import Struct; N = 1000; pack = Struct('n'*N).pack; values = (2**30,)*N" "pack(*values)" --compare-to=../cpython-master/venv/bin/python
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 9.69 us +- 0.02 us
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 8.61 us +- 0.07 us
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 9.69 us +- 0.02 us -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 8.61 us +- 0.07 us: 1.12x faster (-11%)

PyLong_AsSize_t()
$ python -m perf timeit -s "from struct import Struct; N = 1000; pack = Struct('N'*N).pack; values = (2**30,)*N" "pack(*values)" --compare-to=../cpython-master/venv/bin/python
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 10.5 us +- 0.1 us
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 8.19 us +- 0.17 us
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 10.5 us +- 0.1 us -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 8.19 us +- 0.17 us: 1.29x faster (-22%)

PyLong_AsLong()
$ python -m perf timeit -s "from struct import Struct; N = 1000; pack = Struct('l'*N).pack; values = (2**30,)*N" "pack(*values)" --compare-to=../cpython-master/venv/bin/python
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 9.68 us +- 0.02 us
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 8.48 us +- 0.22 us
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 9.68 us +- 0.02 us -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 8.48 us +- 0.22 us: 1.14x faster (-12%)

PyLong_AsUnsignedLong()
$ python -m perf timeit -s "from struct import Struct; N = 1000; pack = Struct('L'*N).pack; values = (2**30,)*N" "pack(*values)" --compare-to=../cpython-master/venv/bin/python
/home/sergey/tmp/cpython-master/venv/bin/python: ..................... 10.5 us +- 0.1 us
/home/sergey/tmp/cpython-dev/venv/bin/python: ..................... 8.41 us +- 0.26 us
Mean +- std dev: [/home/sergey/tmp/cpython-master/venv/bin/python] 10.5 us +- 0.1 us -> [/home/sergey/tmp/cpython-dev/venv/bin/python] 8.41 us +- 0.26 us: 1.25x faster (-20%)

The same pattern is also used in PyLong_AsLongLongAndOverflow(), but I
left it untouched since the proposed change doesn't seem to affect its
performance.

----------
components: Interpreter Core
messages: 350091
nosy: sir-sigurd
priority: normal
severity: normal
status: open
title: speed-up PyLong_As*() for large longs
type: performance
versions: Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue37907>
_______________________________________