On 8/4/2017 11:44 AM, Chris Angelico wrote:
On Sat, Aug 5, 2017 at 12:51 AM, Steve D'Aprano
<steve+pyt...@pearwood.info> wrote:
def isqrt_float(n):
"""Integer square root using floating point sqrt."""
return int(math.sqrt(n))
The operations in the integer version are well-defined and so the answer
should be the same on any Python 3 and indeed in any language that does
proper integer arithmetic (but most do not).
In CPython, Math.sqrt wraps the C compiler's double sqrt function. To
make it deterministic, we must specify the float representation,
rounding rule, and accuracy of the sqrt function. Let's assume that the
float answer is IEEE binary64, with 53 binary digits, with rounding rule
'round to nearest, ties to even' and that the algorithm is as exact as
possible, which is to say, the exact mathematical answer rounded as
specified.
With 53 binary digits, all counts from 0 to 2**53 - 1 are exactly
represented with a exponent of 0, 2**53 = 2**52 * 2, so it is exactly
represented with an exponent of 1. Many other higher counts can be
exactly represented with various exponents, but 2**53 + 1 requires 54
digits. So it is the first count that does not have an exact binary64
representation.
Under these assumptions, how can isqrt_float(n) fail?
Suppose 0 <= m <= 2**26.5, so that m*m <= 2**53. Under the assumptions
that m.sqrt is as exact as possible, sqrt(m*m) == float(m) and
int(float(m)) == m. The true square root of m*m -1 will be m - e (e <
1) and the integer square root is m-1. However, if e is less than (and
I believe equal) to 1/2 of the interval between n and the next lowest
float, m - e will be rounded up to float(m) and ifloat(m*m-1) will
incorrectly return m instead of m-1.
As m increases, e decreases while the interval between floats increases,
so we should expect an eventual error
In this range of m, sqrt(m*m +1) will always be greater than m, so
rounding down cannot lead to an error. For larger m, where float(m*m)
is inaccurate enough, it might.
As m increases, e decreases while the interval between floats increases,
so we should expect an eventual error. Whether it occurs for m <= 2* or
not might be determined by careful analysis, but Chris Angelico already
gave us the brute force answer.
We know that:
- for n <= 2**53, isqrt_float(n) is exact;
but it is not ;-). In this range, the only possibility is
math.sqrt(m*m-1) being m instead of m - delta and that is the case for
Chris' answer 4503599761588224 = 67108865**2 - 1.
int(math.float(4503599761588224) is 67108865 instead of 67108864.
- for n >= 2**1024, isqrt_float(n) will raise OverflowError >> - between those two
values, 2**53 < n < 2**1024, isqrt_float(n)
will sometimes be exact, and sometimes not exact;
- there is some value, let's call it M, which is the smallest
integer where isqrt_float is not exact.
Your mission, should you choose to accept it, is to find M.
Hmm. Thinking aloud a bit here. We know that isqrt_float(n) is not
technically *exact* for any n that is not a square. So what I'd do is
assume that for (n*n+1) to (n+1)*(n+1)-1, it's going to return the
same value. I would then probe every perfect square, and the values
one above and one below it. Now, that's still probably too many to
check, but it's going to be a lot faster; for starters, we don't
actually need to use isqrt_newton to compare against it.
for root in range(2**26, 2**53):
square = root * root
if isqrt_float(square - 1) != root - 1:
print(square - 1)
break
if isqrt_float(square) != root:
print(square)
break
if isqrt_float(square + 1) != root:
print(square + 1)
break
That gave me this result almost instantaneously:
4503599761588224
Your limit was barely low enough ;-).
>>> math.log2(math.sqrt(4503599761588224))
26.000000021497833
If you had started with, say int(2**26.5)-1, you would have missed this.
I reran with range(2, 2**26) and there is nothing lower.
>>> math.log2(4503599761588224)
52.00000004299566
>>> import decimal as d
>>> I = d.Decimal(4503599761588224)
>>> I.sqrt()
Decimal('67108864.99999999254941951410')
>>> float(I.sqrt())
67108865.0
which has been rounded up instead of down. I don't know if that counts
as sufficiently wrong?
The isqrt_float answer is definitely wrong, and so is the claim for n <
2**53.
isqrt_float(4503599761588224)
67108865
isqrt_newton(4503599761588224)
67108864
67108865 ** 2
4503599761588225
--
Terry Jan Reedy
--
https://mail.python.org/mailman/listinfo/python-list