Terry J. Reedy added the comment:

You stated facts: what is your proposal?
The fact that Unicode calls characters 'space' does not make them whitespace as commonly understood, or as defined by C, or even as defined by the Unicode database. Unicode apparently has a WSpace property. According to the table in https://en.wikipedia.org/wiki/Whitespace_%28computer_science%29, 1C - 1F are not included by that definition either. For ASCII chars, that table matches the C definition, with \r included.

So I think your implied proposal to treat them as whitespace (in strings but not bytes) should be rejected as invalid.

For 3.x, the manual should specify that it follows the C definition of 'whitespace' (\r included) for bytes and the extended Unicode definition for strings.

>>> int('3\r')
3
>>> int('3\u00a0')
3
>>> int('3\u2000')
3
>>> int(b'3\r')
3
>>> int(b'3\u00a0')
Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    int(b'3\u00a0')
ValueError: invalid literal for int() with base 10: '3\\u00a0'

----------
nosy: +terry.reedy
type: behavior -> enhancement
versions: +Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18236>
_______________________________________