Jurjen N.E. Bos <j...@users.sourceforge.net> added the comment:

Some more information for the interested: the algorithm I made tries to smoothly change the "chunk size" with the growing length of the exponent. So the exponents that win the most (up to 14% fewer multiplications) are the long exponents that are just shorter than FIVEARY_CUTOFF. But I worked hard to make an algorithm that also saves multiplications for shorter numbers: numbers of up to about 20 bits will use the optimal chunk size. And, of course, the decision must be made quickly, because for some frequently occurring parameters (e.g., 3**25) the routine doesn't take that long anyway.

This is obtained by checking two properties of the exponent that strongly influence the addition chain: the highest four bits, and (surprise!) the number of pairs of bits at distance 2: in other words, (n&n>>2).bit_count().

After days of trying out all kinds of heuristics, and days of number crunching, I measured the optimal parameters. I added the code I used to do that.

Guido may remember that I wrote a chapter of my Ph.D. thesis on the subject of addition chains. The interesting thing is that I used Python for that too: that was almost 30 years ago!

When implementing, I discovered that lots of the code around it had been in flux, so I didn't manage to "cherry-pick" it into 3.10 yet. (One example: the bit_length and bit_count routines were renamed and moved around.) And I don't have Windows 10 :-(

But anyway, have a look and let me hear what you think of it. I'll also want to test and measure it a bit more, but I am sure it is quite stable.

----------
Added file: https://bugs.python.org/file49738/longobject.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue42911>
_______________________________________
Some more information for the interested: The algorithm I made tries to smoothly change the"chunk size" with growing length of the exponent. So the exponents that win the most (up to 14% fewer multiplication) are the long exponents that are just shorter than the FIVEARY_CUTOFF. But, I worked hard to make an algorithm that also saves multiplications for shorter numbers. Numbers of up to about 20 bits will be using the optimal chunk size. And, of course, the decision must be made quickly because for some frequently occurring parameters (e.g., 3**25), the routine doesn't take that long anyway. This is obtained by checking two properties of the exponent that strongly influence the addition chain: the higher four bits, and (surprise!) the number of pairs of bits with distance 2: in other words, (n&n>>2).bit_count(). After days of trying out all kinds of heuristics, and days of crunching,I measured the optimal parameters. I added the code I used to do that. Guido may remember that I wrote a chapter in my Ph.D. on the subject of addition chains. The interesting thing is that I then used Python for that too: that was almost 30 years ago! When implementing, I discovered that lots of the code around it had been in flux, so I didn't manage to "cherry pick" it into 3.10 yet. (One example: the bit_length and bit_count routines were renamed and moved around). And, I don't have windows 10:-( But anyway, have a look and let me hear what you think of it. I'll also want to test and measure it a bit more, but I am sure it is quite stable. ---------- Added file: https://bugs.python.org/file49738/longobject.py _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue42911> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com