Terry J. Reedy added the comment:

You stated facts: what is your proposal?

The fact that Unicode calls characters 'space' does not make them whitespace as 
commonly understood, or as defined by C, or even as defined by the Unicode 
database. Unicode apparently has a White_Space (WSpace) property. According to 
the table in
https://en.wikipedia.org/wiki/Whitespace_%28computer_science%29
1C - 1F are not included by that definition either. For ASCII chars, that table 
matches the C definition, with \r included.
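
As a rough sketch (not part of any proposal), this is one way to see what 
Python's own unicodedata module records for 1C - 1F. The helper name describe 
is made up for illustration. The str.isspace() documentation says a character 
counts as whitespace if its general category is Zs or its bidirectional class 
is WS, B, or S, which is a different test from the White_Space property.

```python
import unicodedata

# U+001C..U+001F are the ASCII file, group, record, and unit separators.
separators = ['\x1c', '\x1d', '\x1e', '\x1f']

def describe(ch):
    """Return (code point, general category, bidi class, isspace) for ch.

    Hypothetical helper, only for this illustration.
    """
    return ('U+{:04X}'.format(ord(ch)),
            unicodedata.category(ch),
            unicodedata.bidirectional(ch),
            ch.isspace())

rows = [describe(ch) for ch in separators]
for row in rows:
    print(row)
```

The category column should read Cc (control) for all four, so whatever 
str.isspace() reports for them comes from the bidirectional class, not from a 
'space' category.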

So I think your implied proposal to treat them as whitespace (in strings but 
not bytes) should be rejected as invalid. For 3.x, the manual should specify 
that int() follows the C definition of 'whitespace' (\r included) for bytes and 
the extended Unicode definition for strings.

>>> int('3\r')
3
>>> int('3\u00a0')
3
>>> int('3\u2000')
3
>>> int(b'3\r')
3
>>> int(b'3\u00a0')
Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    int(b'3\u00a0')
ValueError: invalid literal for int() with base 10: '3\\u00a0'
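
The same comparison as a script, as a minimal sketch. The helper try_int is 
hypothetical. Note that inside a bytes literal \u is not an escape, so 
b'3\u00a0' in the session above is a literal backslash followed by 'u00a0'; 
the sketch spells that out as b'3\\u00a0' and also tries the single Latin-1 
no-break-space byte b'3\xa0'.

```python
def try_int(value):
    """Return int(value), or None if int() raises ValueError.

    Hypothetical helper, only for this illustration.
    """
    try:
        return int(value)
    except ValueError:
        return None

# str input: trailing \r, NO-BREAK SPACE, and EN QUAD are all stripped.
str_results = [try_int('3\r'), try_int('3\u00a0'), try_int('3\u2000')]

# bytes input: only ASCII whitespace such as \r is stripped.
bytes_results = [try_int(b'3\r'), try_int(b'3\xa0'), try_int(b'3\\u00a0')]

print(str_results)    # [3, 3, 3]
print(bytes_results)  # [3, None, None]
```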

----------
nosy: +terry.reedy
type: behavior -> enhancement
versions: +Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18236>
_______________________________________