[issue43053] Speed up math.isqrt, again

2021-01-30 Thread Juraj Sukop
Juraj Sukop added the comment: Mark, thank you very much for the thorough reply! I believe it is not for me to say whether the complication is worth it or not. What really huge numbers goes, if I'm not mistaken, the original `isqrt` is already "fast" and only constant speed

[issue43053] Speed up math.isqrt, again

2021-01-28 Thread Juraj Sukop
Juraj Sukop added the comment: The rounding down of `l` might compute more than half of the bits so that the final Heron' step in `isqrt_2` might correct the uncertain low bit if `a - (a*a > n)` is missing from `isqrt`. As it currently stands, `a - (a*a > n)` is computed both in

[issue43053] Speed up math.isqrt, again

2021-01-28 Thread Juraj Sukop
Juraj Sukop added the comment: What the proof goes, you did most of the work already. Consider the following: l = (n.bit_length() - 1)//4 a = isqrt(n >> 2*l) a = ((a << l) + n//(a << l))//2 return a - (a*a > n) This computes the square root of the (possib

[issue43053] Speed up math.isqrt, again

2021-01-28 Thread Juraj Sukop
New submission from Juraj Sukop : This is a follow up to https://bugs.python.org/issue36887 and https://bugs.python.org/issue36957 . The new `isqrt` is remarkably simple but it does not split the number at hand optimally. Ideally one would want to have 2n/n division everywhere but since the

[issue29282] Fused multiply-add: proposal to add math.fma()

2020-01-13 Thread Juraj Sukop
Juraj Sukop added the comment: FWIW, there is a new implementation of FMA [1] which is licensed very permissively [2]. Perhaps it could be used here as well..? [1] https://github.com/smasher164/fma [2] https://github.com/smasher164/fma/commit/4e85d2388c7d4d850be12df918f9431ca687f57a

[issue29282] Fused multiply-add: proposal to add math.fma()

2017-01-16 Thread Juraj Sukop
Juraj Sukop added the comment: I would say because it has wide applicability, especially considering the amount of code it adds. It is similar in spirit to `copysign` or `fsum` which are already there and C99 includes it anyway. For simpler things like dot product or polynomial evaluation

[issue29282] Fused multiply-add: proposal to add math.fma()

2017-01-16 Thread Juraj Sukop
Changes by Juraj Sukop : Added file: http://bugs.python.org/file46300/setup.py ___ Python tracker <http://bugs.python.org/issue29282> ___ ___ Python-bugs-list mailin

[issue29282] Fused multiply-add: proposal to add math.fma()

2017-01-16 Thread Juraj Sukop
Changes by Juraj Sukop : Added file: http://bugs.python.org/file46298/xmathmodule.c ___ Python tracker <http://bugs.python.org/issue29282> ___ ___ Python-bugs-list mailin

[issue29282] Fused multiply-add: proposal to add math.fma()

2017-01-16 Thread Juraj Sukop
Changes by Juraj Sukop : Added file: http://bugs.python.org/file46299/example.py ___ Python tracker <http://bugs.python.org/issue29282> ___ ___ Python-bugs-list mailin

[issue29282] Fused multiply-add: proposal to add math.fma()

2017-01-16 Thread Juraj Sukop
New submission from Juraj Sukop: Fused multiply-add (henceforth FMA) is an operation which calculates the product of two numbers and then the sum of the product and a third number with just one floating-point rounding. More concretely: r = x*y + z The value of `r` is the same as if the