New submission from Mark Dickinson <dicki...@gmail.com>: Here's a patch that streamlines the x_divrem function in Objects/longobject.c. More benchmarks are needed, but the initial results are promising: for py3k, on a 32-bit machine with 15-bit digits, Victor's pidigits benchmark runs almost twice as fast with this patch (numbers below).
Patch details ============= - Normalize inputs by shifting instead of multiplying and dividing. This halves the number of divisions used in the algorithm. - Streamline innermost loop. - Save an iteration of outer loop around half the time. - Save an object allocation: only 3 allocations per x_divrem call instead of 4. - Remove special case where initial quotient estimate is >= PyLong_BASE. There's no need for this, since the 'digit' type holds values up to 2*PyLong_BASE - 1. - Make q type digit instead of type twodigits: this halves the size of the multiplication in the innermost loop. Benchmark results ================= Using the pidigits_bestof.py script that's posted in the issue 4258 discussion, on a non-debug build of py3k (r70465), on OS X 10.5.6/Core 2 Duo: Unpatched --------- Macintosh-3:py3k dickinsm$ ./python.exe ../pidigits_bestof.py 2000 performing a warm up run... running sys.int_info= sys.int_info(bits_per_digit=15, sizeof_digit=2) Time; 2234.6 ms Time; 2232.2 ms Time; 2227.9 ms Time; 2225.7 ms Time; 2229.8 ms Best Time; 2225.7 ms Patched ------- dickinsm$ ./python.exe ../pidigits_bestof.py 2000 performing a warm up run... running sys.int_info= sys.int_info(bits_per_digit=15, sizeof_digit=2) Time; 1175.6 ms Time; 1176.5 ms Time; 1177.3 ms Time; 1179.5 ms Time; 1168.5 ms Best Time; 1168.5 ms So the patch gives a speedup of around 90%. This particular benchmark is heavy on the divisions though, so I'd expect lesser speedups for other benchmarks. ---------- assignee: marketdickinson components: Interpreter Core files: faster_integer_division.patch keywords: patch messages: 83781 nosy: marketdickinson priority: normal severity: normal status: open title: Streamline integer division type: performance versions: Python 2.7, Python 3.1 Added file: http://bugs.python.org/file13368/faster_integer_division.patch _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue5512> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com