[issue10114] compile() doesn't support the PEP 383 (surrogates)

2010-10-18 Thread STINNER Victor
STINNER Victor added the comment: Buildbots are green again (#10123 is closed). I ported the fix to Python 3.1 (r85716). Closing this issue. -- resolution: -> fixed; status: open -> closed

[issue10114] compile() doesn't support the PEP 383 (surrogates)

2010-10-16 Thread STINNER Victor
STINNER Victor added the comment: I created #10123 for the test_doctest regression.

[issue10114] compile() doesn't support the PEP 383 (surrogates)

2010-10-16 Thread STINNER Victor
STINNER Victor added the comment: Committed to 3.2 (r85569+r85570). I will wait for the buildbots before porting the patch to 3.1 and closing the issue. There is already a regression on the Gentoo buildbot with the ascii locale encoding, in test_doctest test_zipimport_support: http://www.python.org/dev/buildbot/

[issue10114] compile() doesn't support the PEP 383 (surrogates)

2010-10-16 Thread STINNER Victor
STINNER Victor added the comment: Oh, I just realized that Python 3.1.2 (the last Python 3.1 release) was released on 21 March, whereas r82063 (the commit for #6543) was made on 17 June. So the encoding change has not been released yet.

[issue10114] compile() doesn't support the PEP 383 (surrogates)

2010-10-16 Thread STINNER Victor
STINNER Victor added the comment:
> Here is a new patch [code_encoding.patch] implementing this idea:
> - Use filesystem encoding (and surrogateescape) to encode/decode
>   paths in compile() and the parser, instead of utf-8 in strict mode
> (...)
The patch restores the situation before #6543.

[issue10114] compile() doesn't support the PEP 383 (surrogates)

2010-10-15 Thread STINNER Victor
Changes by STINNER Victor: Removed file: http://bugs.python.org/file19243/compile_surrogates.patch

[issue10114] compile() doesn't support the PEP 383 (surrogates)

2010-10-15 Thread STINNER Victor
STINNER Victor added the comment: I removed [compile_surrogates.patch] because it creates filenames that cannot be encoded to the filesystem encoding. E.g. compile('', '\udcc3\udca9', 'exec').co_filename gives 'é' even if the filesystem encoding is 'ascii'.
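The effect described above can be illustrated at the codec level. This is only a sketch of the mechanism, not the removed patch itself: it assumes the patch went through UTF-8 with surrogateescape, and all variable names are illustrative.

```python
# Illustrative sketch (not the actual patch): two escaped bytes merge into 'é'.
name = '\udcc3\udca9'  # two lone surrogates standing for the raw bytes 0xC3 0xA9

# surrogateescape turns each surrogate back into its original byte.
raw = name.encode('utf-8', 'surrogateescape')

# Those two bytes happen to be valid UTF-8, so strict decoding yields one character.
merged = raw.decode('utf-8')

# The merged name can no longer be encoded with an 'ascii' filesystem encoding.
try:
    merged.encode('ascii')
    merged_fits_ascii = True
except UnicodeEncodeError:
    merged_fits_ascii = False

# The original name, by contrast, survives ascii + surrogateescape unchanged.
ascii_raw = name.encode('ascii', 'surrogateescape')
restored = ascii_raw.decode('ascii', 'surrogateescape')
```

Keeping the name in its surrogate-escaped form is what lets it round-trip through an ascii filesystem encoding.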

[issue10114] compile() doesn't support the PEP 383 (surrogates)

2010-10-15 Thread STINNER Victor
STINNER Victor added the comment:
> I do not see what filesystem encodings, or any other encoding
> to bytes should really have to do with the [code.co_filename].
The co_filename attribute is used to display the traceback: Python opens the related file, reads the source code line and displays it. O
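The role of co_filename in tracebacks can be seen with a small experiment. This is a hedged sketch: the file name demo_script.py and the variable names are made up for illustration, and the code relies only on the standard tempfile and traceback modules.

```python
import os
import tempfile
import traceback

# Hypothetical two-line script; the second line raises.
source = "x = 1\nraise ValueError('boom')\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_script.py")  # illustrative name
    with open(path, "w", encoding="utf-8") as f:
        f.write(source)

    # The filename passed to compile() is stored as code.co_filename.
    code = compile(source, path, "exec")

    try:
        exec(code, {})
    except ValueError:
        # Formatting the traceback opens the file named by co_filename
        # to fetch the text of the failing line.
        report = traceback.format_exc()
```

The formatted report contains the source line only because the file named by co_filename could be opened, which is why the name has to be usable as a real path.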

[issue10114] compile() doesn't support the PEP 383 (surrogates)

2010-10-15 Thread Terry J. Reedy
Terry J. Reedy added the comment: Pardon my ignorance, but given that code.co_filename is a string attribute given as a string, which is to say, unicode in 3.x, I do not see what filesystem encodings, or any other encoding to bytes should really have to do with the attribute. I actually would

[issue10114] compile() doesn't support the PEP 383 (surrogates)

2010-10-15 Thread STINNER Victor
STINNER Victor added the comment: > All filenames should use the filesystem encoding in Python. Here is a new patch [code_encoding.patch] implementing this idea: - Use filesystem encoding (and surrogateescape) to encode/decode paths in compile() and the parser, instead of utf-8 in strict mod
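From Python code, the same idea (filesystem encoding plus surrogateescape) is available through os.fsdecode() and os.fsencode(). The sketch below assumes a POSIX system, since Windows uses a different error handler; the byte string is an arbitrary example, not taken from the patch.

```python
import os
import sys

# Arbitrary example path: 0xFF is never valid in UTF-8, so it must be escaped
# when the filesystem encoding is UTF-8 (assumption: POSIX platform).
raw_path = b'caf\xc3\xa9_\xff.py'

# bytes -> str using the filesystem encoding and its error handler
text_path = os.fsdecode(raw_path)

# str -> bytes, reversing the step above
round_trip = os.fsencode(text_path)

# For reference: the encoding both calls rely on
fs_encoding = sys.getfilesystemencoding()
```

Whatever the locale, the bytes come back unchanged, which is the property a strict utf-8 conversion cannot guarantee.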

[issue10114] compile() doesn't support the PEP 383 (surrogates)

2010-10-15 Thread STINNER Victor
STINNER Victor added the comment: See also #9713.

[issue10114] compile() doesn't support the PEP 383 (surrogates)

2010-10-15 Thread STINNER Victor
STINNER Victor added the comment: #6543 changed code->co_filename encoding from filesystem encoding+surrogateescape to utf-8+strict. With my patch, compile('', '\udcc3\udca9', 'exec').co_filename gives 'é', it doesn't depend on the filesystem encoding. But 'é' cannot be used with all filesys

[issue10114] compile() doesn't support the PEP 383 (surrogates)

2010-10-15 Thread Terry J. Reedy
Terry J. Reedy added the comment: I think the title is slightly misleading. As I read the patch, the issue is that PyArg_ParseTupleAndKeywords requires that string args to C functions be valid Unicode strings (and that it does this by trying to encode to utf-8). Your patch subverts this by re

[issue10114] compile() doesn't support the PEP 383 (surrogates)

2010-10-15 Thread Antoine Pitrou
Changes by Antoine Pitrou: -- nosy: +benjamin.peterson

[issue10114] compile() doesn't support the PEP 383 (surrogates)

2010-10-15 Thread STINNER Victor
New submission from STINNER Victor:

Example:

    $ ./python
    Python 3.2a3+ (py3k, Oct 15 2010, 14:31:59)
    >>> compile('', 'abc\uDC80', 'exec')
    ...
    UnicodeEncodeError: 'utf-8' codec can't encode character '\udc80' in position 3: surrogates not allowed

Attached patch encodes manually the filename to
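The error in the report can be reproduced at the codec level, without calling compile(). A minimal sketch using only the built-in codecs; the variable names are illustrative:

```python
# Illustrative sketch: why strict UTF-8 rejects the filename from the report.
filename = 'abc\udc80'  # '\udc80' is a lone surrogate, as produced by PEP 383 decoding

try:
    filename.encode('utf-8')  # strict error handling is the default
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False  # "surrogates not allowed"

# The surrogateescape handler maps U+DC80 back to the raw byte 0x80.
raw = filename.encode('utf-8', 'surrogateescape')

# Decoding with the same handler restores the original string.
restored = raw.decode('utf-8', 'surrogateescape')
```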