New submission from Oren Milman:

------------ current state ------------
1. In Objects/longobject.c in _PyLong_AsUnsignedLongMask, in case v is a 
multiple-digit int, _PyLong_AsUnsignedLongMask iterates over all of its digits 
(going from the most to the least significant digit) and does (for each digit) 
'x = (x << PyLong_SHIFT) | v->ob_digit[i];'.
However, iterating over digits other than the 'ceil(sizeof(x) * 8.0 / 
PyLong_SHIFT)' least significant digits is redundant, because their bits would 
be shifted out of x anyway.

With regard to relevant changes made in the past, the iteration over all of the 
digits of a multiple-digit v remained quite the same since 
_PyLong_AsUnsignedLongMask was added, in revision 28652.

2. In Modules/_testcapimodule.c, the function comment of test_k_code, and an 
error string inside that function, contain mistakes.

With regard to relevant changes made in the past, these mistakes were there 
since test_k_code was added, in revision 28652.


------------ proposed changes ------------
1. In _PyLong_AsUnsignedLongMask, in case v is a multiple-digit int, iterate 
only over the 'Py_MIN(i, sizeof(x) * 8 / PyLong_SHIFT + 1)' least significant 
digits.

Obviously, 'sizeof(x) * 8 / PyLong_SHIFT + 1' might be off by one in case 
CPython is compiled with a PyLong_SHIFT other than 15 or 30. I suspect it's an 
overkill, but I added an assert, just in case.

2. Fix the function comment of test_k_code, and an error string inside that 
function.


------------ alternative changes ------------
1. I have also considered iterating only over the 'Py_MIN(i, 
(Py_ssize_t)ceil(sizeof(x) * 8.0 / PyLong_SHIFT))' least significant digits. 
Even though this number of digits is guaranteed to be accurate, it cannot be 
calculated at compile time, so it would probably cause the optimization to 
overall introduce a performance penalty.


------------ diff ------------
The proposed patches diff file is attached.


------------ tests ------------
I built the patched CPython for x86, and played with it a little. Everything 
seemed to work as usual. 

In addition, I ran 'python_d.exe -m test -j3' (on my 64-bit Windows 10) with 
and without the patches, and got quite the same output.
The outputs of both runs are attached.

----------
components: Interpreter Core
files: proposedPatches.diff
keywords: patch
messages: 268274
nosy: Oren Milman
priority: normal
severity: normal
status: open
title: redundant iteration over digits in _PyLong_AsUnsignedLongMask
type: performance
Added file: http://bugs.python.org/file43346/proposedPatches.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27298>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to