[issue19846] Setting LANG=C breaks Python 3 on Linux

2013-12-09 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: On 09.12.2013 11:19, STINNER Victor wrote: > > STINNER Victor added the comment: > > Marc-Andre> AFAIK, Python 3 does work with ASCII data in the C locale, so I'm > not sure whether this is a bug at all. > > What do you mean? Python uses the surrogateesca

[issue19846] Setting LANG=C breaks Python 3 on Linux

2013-12-09 Thread STINNER Victor
STINNER Victor added the comment:
Marc-Andre> AFAIK, Python 3 does work with ASCII data in the C locale, so I'm not sure whether this is a bug at all.
What do you mean? Python has used the surrogateescape error handler since Python 3.1; undecodable bytes are stored as surrogate characters. Many bugs r
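For readers unfamiliar with the surrogateescape handler mentioned above, here is a minimal sketch of the round trip. It assumes the ASCII codec stands in for the C locale encoding, and the byte string is an invented example rather than anything taken from the thread.

```python
# UTF-8 bytes for "abé.txt"; the two bytes 0xC3 0xA9 are not valid ASCII.
raw = b"ab\xc3\xa9.txt"

# With the surrogateescape error handler, each undecodable byte 0xNN
# is mapped to the lone surrogate U+DCNN instead of raising an error.
text = raw.decode("ascii", errors="surrogateescape")
print(ascii(text))  # 'ab\udcc3\udca9.txt'

# Encoding with the same handler restores the original bytes exactly.
restored = text.encode("ascii", errors="surrogateescape")
print(restored == raw)  # True

# Without the handler, the surrogates cannot be encoded.
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    print("strict encode failed:", exc.reason)
```

The round trip only works when both directions use the same codec and the same error handler.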

[issue19846] Setting LANG=C breaks Python 3 on Linux

2013-12-09 Thread STINNER Victor
STINNER Victor added the comment: Nick> testing applications for POSIX compliance Sorry but what do you mean by "POSIX compliance"? The POSIX standard only specifies the ASCII encoding. http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html "The tables in Locale Definition descr

[issue19846] Setting LANG=C breaks Python 3 on Linux

2013-12-09 Thread STINNER Victor
STINNER Victor added the comment: I didn't understand Serhiy's "ls" example. I tried:
$ mkdir unicode
$ cd unicode
$ python3 -c 'open("ab\xe9.txt", "w").close()'
$ python3 -c 'open("euro\u20ac.txt", "w").close()'
$ ls
abé.txt euro€.txt
$ LANG=C ls
ab??.txt euro???.txt
Ah yes, I didn't rememb
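The number of question marks in the `LANG=C ls` output matches the number of UTF-8 bytes in each non-ASCII character. The sketch below reproduces that count under two assumptions: the file names are stored as UTF-8, and every byte outside printable ASCII is shown as `?`. The helper `c_locale_view` is hypothetical and is not how `ls` is actually implemented.

```python
def c_locale_view(name: str) -> str:
    """Hypothetical helper: show a UTF-8 encoded name the way a
    byte-oriented tool limited to printable ASCII might display it."""
    data = name.encode("utf-8")
    # Keep printable ASCII bytes, replace every other byte with "?".
    return "".join(chr(b) if 0x20 <= b < 0x7F else "?" for b in data)


for name in ("ab\xe9.txt", "euro\u20ac.txt"):
    print(c_locale_view(name))
# ab??.txt      (U+00E9 is 2 bytes in UTF-8)
# euro???.txt   (U+20AC is 3 bytes in UTF-8)
```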

[issue19846] Setting LANG=C breaks Python 3 on Linux

2013-12-09 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: The "C" locale is part of the ANSI C standard. The "POSIX" locale is an alias for the "C" locale and a POSIX standard, so we cannot just replace the ASCII encoding with UTF-8 as we wish, so Antoine's patch won't work. See e.g. http://pubs.opengroup.org/onl

[issue19846] Setting LANG=C breaks Python 3 on Linux

2013-12-09 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > sworddragon@ubuntu:~$ LANG=C > sworddragon@ubuntu:~$ ä > bash: $'\303\244': command not found > > - The terminal doesn't pseudo-crash with an exception because it doesn't > matter about encodings. - It allows to change the encoding at runtime. This is not a

[issue19846] Setting LANG=C breaks Python 3 on Linux

2013-12-09 Thread Serhiy Storchaka
Serhiy Storchaka added the comment:
> And yet, in Python 2, people could do that, and Python didn't care.
> *That's* the regression I'm worried about. If it hadn't round-tripped
> cleanly in Python 2, I wouldn't care here either.
$ python2.7 -c "print u'\u20ac'"
€
$ LANG=C python2.7 -c "print u'

[issue19846] Setting LANG=C breaks Python 3 on Linux

2013-12-08 Thread Sworddragon
Sworddragon added the comment: You should keep things simpler:
- Python and the operating system/filesystem are in a client-server relationship and Python should validate all.
- It doesn't matter what you will finally decide to be the default encoding on various places - all will provide r

[issue19846] Setting LANG=C breaks Python 3 on Linux

2013-12-08 Thread Nick Coghlan
Nick Coghlan added the comment: On 9 December 2013 12:08, STINNER Victor wrote: > > STINNER Victor added the comment: > >> End users tripping over this by setting LANG=C is one of the pain points of >> Python 3 relative to Python 2 for Fedora, so I've added a couple of Fedora >> folks to the n

[issue19846] Setting LANG=C breaks Python 3 on Linux

2013-12-08 Thread STINNER Victor
STINNER Victor added the comment:
> End users tripping over this by setting LANG=C is one of the pain points of
> Python 3 relative to Python 2 for Fedora, so I've added a couple of Fedora
> folks to the nosy list.
Sorry, I'm not aware of such an issue. Do you have examples?
> - the main problem

[issue19846] Setting LANG=C breaks Python 3 on Linux

2013-12-08 Thread Nick Coghlan
Nick Coghlan added the comment: End users tripping over this by setting LANG=C is one of the pain points of Python 3 relative to Python 2 for Fedora, so I've added a couple of Fedora folks to the nosy list. My current understanding of the situation: - we should leave Windows and Mac OS X alon

[issue19846] Setting LANG=C breaks Python 3

2013-12-08 Thread STINNER Victor
STINNER Victor added the comment:
> It seems there is more work to do to get this right, but I'm not
> terribly interested either. Feel free to take over.
If you are talking to me: I'm currently opposed to changing anything, so I'm not interested in working on a patch. IMO Python works fine and you

[issue19846] Setting LANG=C breaks Python 3

2013-12-08 Thread Antoine Pitrou
Antoine Pitrou added the comment: On Sun, 2013-12-08 at 22:22 +, STINNER Victor wrote:
> (b) for technical reasons, Python reuses the C codec during Python
> initialization to decode and encode OS data, and so currently Python
> *must* use the locale encoding for its "filesystem encoding"
A

[issue19846] Setting LANG=C breaks Python 3

2013-12-08 Thread STINNER Victor
STINNER Victor added the comment:
>> Or said differently, the filesystem encoding is different than the
>> locale encoding.
> Indeed, but the FS encoding and the IO encoding are the same.
> "locale encoding" doesn't really matter here, as we are assuming that
> it's wrong.
Oh, I realized that "
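To see which of the encodings discussed here are in effect on a given system, a small report along the following lines can help. This is only a sketch: `report_encodings` is a hypothetical helper name, and the values it returns depend on the platform, the Python version, and the environment (for example whether `LANG=C` is set).

```python
import locale
import sys


def report_encodings() -> dict:
    """Hypothetical helper: collect the encodings Python is using."""
    return {
        # Used to decode and encode file names and other OS data.
        "filesystem": sys.getfilesystemencoding(),
        # Derived from the locale settings (LANG, LC_ALL, LC_CTYPE).
        "locale": locale.getpreferredencoding(False),
        # Used by print(); may be None if stdout has been replaced.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Default for str.encode() / bytes.decode(); fixed in Python 3.
        "default": sys.getdefaultencoding(),
    }


for label, encoding in report_encodings().items():
    print(f"{label:>10}: {encoding}")
```

Running the script once normally and once with `LANG=C` in the environment shows which of these values follow the locale on that system.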

[issue19846] Setting LANG=C breaks Python 3

2013-12-08 Thread STINNER Victor
Changes by STINNER Victor: title: print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding() -> Setting LANG=C breaks Python 3

[issue19846] Setting LANG=C breaks Python 3

2013-12-08 Thread Nick Coghlan
Changes by Nick Coghlan: title: print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding() -> Setting LANG=C breaks Python 3