On Sun, Sep 17, 2017 at 5:54 PM, Steve D'Aprano <steve+pyt...@pearwood.info> wrote:
> To even *know* that there are branches of maths where int/int isn't defined,
> you need to have learned aspects of mathematics that aren't even taught in
> most undergrad maths degrees. (I have a degree in maths, and if we ever
> covered areas where int/int was undefined, it was only briefly, and I've
> long since forgotten it.)

How about this:

>>> (1<<10000)/2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: integer division result too large for a float

int/int is now undefined. In Py2, it perfectly correctly returns another
integer (technically a long), but in Py3, it can't return a float, so it
errors out. This is nothing to do with the mathematical notion of a "real",
which is a superset of the mathematical notion of an "integer"; it's all to
do with the Python notion of a "float", which is NOT a superset of the
Python notion of an "integer".

In Python 2, an ASCII string could be implicitly promoted to a Unicode
string:

>>> user_input = u"Real Soon Now™"
>>> print("> " + user_input + " <")
> Real Soon Now™ <

In Python 2 and 3, a small integer can be implicitly promoted to float:

>>> user_input = 3.14159
>>> print(user_input + 1)
4.14159

Both conversions can cause data-dependent failures when used with arbitrary
input, but are unlikely to cause problems when you're promoting literals.
Both conversions require proximity to the other type. As long as you're
explicit about the data type used for user input, you can short-hand your
literals and get away with it:

>>> import decimal, fractions
>>> # more likely, input came as text
>>> user_input = float("1.234")
>>> print(user_input + 1)
2.234
>>> # and hey, it works with other types too!
>>> user_input = decimal.Decimal("1.234")
>>> print(user_input + 1)
2.234
>>> user_input = fractions.Fraction("1.234")
>>> print(user_input + 1)
1117/500

The trouble only comes when you take two pieces of user input in different
types, and try to combine them:

>>> user_1 = float("1.234")
>>> user_2 = int("9"*999) # imagine someone typed it
>>> user_1 + user_2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: int too large to convert to float

Solution? Always use the right data types for user input. Easy enough.

Python 3 introduces a completely different way to get failure, though.
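
As a rough illustration of that new failure mode, here is a minimal Python 3 sketch. The helper names `halve_true` and `halve_floor` are made up for this example; the point is only that both inputs are plain ints, and whether the call succeeds depends on the value, not the type.

```python
# Sketch only: `halve_true` and `halve_floor` are hypothetical helper names.

def halve_true(n):
    """True division: in Python 3, int / int always produces a float."""
    return n / 2


def halve_floor(n):
    """Floor division: int // int stays an int, however large n is."""
    return n // 2


small = 10
huge = 1 << 10000  # same type as `small`, just a much bigger value

print(type(halve_true(small)))           # a float comes back from two ints
print(halve_floor(small))                # an int comes back
print(halve_floor(huge) == 1 << 9999)    # exact, no float involved

try:
    halve_true(huge)
except OverflowError as exc:
    # The quotient does not fit in a float, so true division raises.
    print("OverflowError:", exc)
```

If the program only ever needs integer results, `//` sidesteps the float entirely; the catch is that the author has to know to reach for it.
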
You can be 100% consistent with your data types, but then get data-dependent
failures if, and only if, you divide. (Technically, not completely new in
Py3; you can get this in Py2 with exponentiation - "2**-1" will yield a
float. Far less likely to be hit, but could potentially cause the same
problems.) I don't know of any exploits that involve this, but I can imagine
that you could attack a Python script by forcing it to go floating-point,
then either crashing it with a huge integer, or exploiting round-off,
depending on whether the program is assuming floats or assuming ints.

Python 3 *removed* one of these data-dependent distinctions, by making
bytes+text into an error:

>>> # Python 2
>>> b"asdf" + u"qwer"
u'asdfqwer'

>>> # Python 3
>>> b"asdf" + u"qwer"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't concat str to bytes

But it added a different one, by allowing a common and normal operation to
change a data type. Is it better to make things convenient for the case of
small integers (the ones that are perfectly representable as floats), while
potentially able to have problems on larger ones? Considering how large a
"small integer" can be, most programmers won't think to test for overflow -
just as many programmers won't test non-ASCII data. Thanks to Python 3, the
"non-ASCII data" one isn't a problem, because you'll get the same exception
with ASCII data as with any other; but the "small integer" one now is.

Data-dependent type errors don't seem like a smart thing to me.

ChrisA