On Sun, 17 Sep 2017 08:43 pm, Chris Angelico wrote:

> On Sun, Sep 17, 2017 at 5:54 PM, Steve D'Aprano
> <steve+pyt...@pearwood.info> wrote:
>> To even *know* that there are branches of maths where int/int isn't defined,
>> you need to have learned aspects of mathematics that aren't even taught in
>> most undergrad maths degrees. (I have a degree in maths, and if we ever
>> covered areas where int/int was undefined, it was only briefly, and I've long
>> since forgotten it.)
>
> How about this:
>
> >>> (1<<10000)/2
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> OverflowError: integer division result too large for a float
>
> int/int is now undefined.

No, it's perfectly defined: you get an overflow error if the arguments are too
big to convert, or an underflow error if the denominator is too small, or a
divide by zero error if you divide by zero...

What do you make of this?

py> float(1<<10000)/2.0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: int too large to convert to float

Would you like to argue that this shows that coercing ints to floats is
"undefined"?

Overflow and underflow errors are limitations of the float data type. We could
fix that in a couple of ways:

- silently underflow to zero (as Python already does!) or infinity, as needed;
- use a bigger ~~boat~~ float;
- or even an arbitrary precision float;
- or return a rational number (fraction or similar);
- or introduce a float context that allows you to specify the behaviour that
  you want, as the decimal module does.

There may be other solutions I haven't thought of. But these will do.

The distinction between Python floats and real numbers ℝ is a red herring. It
isn't relevant.

> In Py2, it perfectly correctly returns
> another integer (technically a long), but in Py3, it can't return a
> float, so it errors out.

Apart from your "correctly", which I disagree with, that's a reasonable
description. The problem is that your example returns the correct result by
accident. Forget such ludicrously large values, and try something more common:

1/2

Most people aren't expecting integer division, but true division, and silently
returning the wrong result (0 instead of 0.5) is a silent source of bugs.

This isn't some theoretical problem that might, maybe, perhaps, be an issue for
some people sometimes. It was a regular source of actual bugs leading to code
silently returning garbage.

> This is nothing to do with the mathematical
> notion of a "real",

I don't believe I ever mentioned Reals. I was pretty careful not to.

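
To make the list of possible fixes above a little more concrete, here is a
minimal sketch using only the standard library. It assumes Python 3, and the
variable names (big, tiny, floor_half, exact_half, decimal_half) are made up
purely for illustration. It is not a proposal for how / ought to behave, just
a demonstration that the overflow belongs to the float type and not to
int/int as such.

```python
from decimal import Decimal
from fractions import Fraction

big = 1 << 10000  # the value from the quoted example

# True division fails loudly when the result cannot fit in a float.
try:
    big / 2
    overflowed = False
except OverflowError:
    overflowed = True

# A quotient too small for a float silently underflows to zero.
tiny = 1 / big

# Floor division and Fraction both stay exact for ints of any size.
floor_half = big // 2
exact_half = Fraction(big, 2)

# Decimal rounds to the context precision (28 digits by default),
# but its exponent range is far wider than a float's.
decimal_half = Decimal(big) / 2

# For everyday values, true division and floor division differ visibly.
print(1 / 2, 1 // 2, Fraction(1, 2))
print(overflowed, tiny, floor_half == exact_half)
```

Which of these is appropriate depends on whether an exact answer, a rounded
answer, or an explicit error is wanted; the sketch only shows that each option
is already available without changing the / operator.
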
> which is a superset of the mathematical notion of
> an "integer"; it's all to do with the Python notion of a "float",
> which is NOT a superset of the Python notion of an "integer".

So? Operations don't *have* to return values from their operands' type.

len('abc') doesn't return a string.

alist.find(1) doesn't have to return either a list or an int.

And 1/2 doesn't have to return an int. Why is this such a big deal?

> In Python 2, an ASCII string could be implicitly promoted to a Unicode string:
>
> >>> user_input = u"Real Soon Now™"
> >>> print("> " + user_input + " <")
> > Real Soon Now™ <

And that was a bug magnet, like using / for integer division sometimes and true
division other times was a bug magnet. So Python 3 got rid of both bad design
decisions.

> In Python 2 and 3, a small integer can be implicitly promoted to float:
>
> >>> user_input = 3.14159
> >>> print(user_input + 1)
> 4.14159

Yes, as it should. Why force the user to call float() on one argument when the
interpreter can do it? What advantage is there?

Can you demonstrate any failure of dividing two ints n/m which wouldn't equally
fail if you called float(n)/float(m)? I don't believe that there is any such
failure mode. Forcing the user to manually coerce to floats doesn't add any
protection.

> Both conversions can cause data-dependent failures when used with
> arbitrary input,

There's a difference:

- with automatic promotion of bytes to Unicode, you get errors that pass
  silently and garbage results;

- but with true division, if int/int cannot be performed using floats, you get
  an explicit error.

Silently returning the wrong result was a very common consequence of the int/int
behaviour in Python 2. Is there any evidence of common, real-world bugs caused
by true division?
Beginners who make assumptions that Python is C (or any other language) and
use / when they should use // don't count: that's no different from somebody
using ^ for exponentiation.

[...]
> The trouble only comes when you take two pieces of user input in
> different types, and try to combine them:
>
> >>> user_1 = float("1.234")
> >>> user_2 = int("9"*999) # imagine someone typed it
> >>> user_1 + user_2
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> OverflowError: int too large to convert to float

I'm sorry, I fail to see why you think this is "trouble". It's just normal
Python behaviour in the face of errors: raise an exception. If you pass a bad
value, you get an exception of some kind. Are these "trouble" too?

py> ''[5]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range

py> int('xyz')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'xyz'

Getting an explicit exception on error is the right thing to do. Silently
returning garbage is not.

If you want to argue that int/int should return infinity, or a NAN, on overflow,
that's possibly defensible. But arguing that somehow the division operator is
uniquely or specifically "trouble" because it raises an exception when given bad
data, well, that's just weird.

> Python 3 introduces a completely different way to get failure, though.
> You can be 100% consistent with your data types, but then get
> data-dependent failures if, and only if, you divide.

It's true that most operations on integers will succeed. But not all.

Try (1<<10000)**(1<<10000) if you really think that integer ops are guaranteed
to succeed.

(I'm scared to try it myself, because I've had bad experiences in the past with
unreasonably large ints.)

But then, what of it? All that means is that division can fail.
But even integer division can fail:

py> 1//0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: integer division or modulo by zero

[...]
> I don't know of any exploits
> that involve this, but I can imagine that you could attack a Python
> script by forcing it to go floating-point, then either crashing it
> with a huge integer, or exploiting round-off, depending on whether the
> program is assuming floats or assuming ints.

You're not seriously arguing that true division is a security vulnerability?

In any case, the error here is an exception, not a silent failure.

    "I find it amusing when novice programmers believe their main job is
    preventing programs from crashing. ... More experienced programmers
    realize that correct code is great, code that crashes could use
    improvement, but incorrect code that doesn’t crash is a horrible
    nightmare."
    -- Chris Smith

Using / for integer division, if and only if both arguments are integers, was
exactly that horrible nightmare.

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

--
https://mail.python.org/mailman/listinfo/python-list