STINNER Victor <victor.stin...@haypocalc.com> added the comment:

The problem is not specific to Py_CompileString(): all functions based (directly or indirectly) on PyParser_ASTFromString() and PyParser_ASTFromFile() expect filenames encoded in utf-8 with the strict error handler.
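
To make the constraint concrete, here is a minimal Python-level sketch of what "utf-8 with the strict error handler" means for a filename that is not valid utf-8. It does not call the parser API; the filename bytes and the helper names (`decode_strict`, `decode_surrogateescape`) are hypothetical and exist only for illustration. It also shows how the surrogateescape error handler (PEP 383) would keep such a name round-trippable.

```python
# Hypothetical filename bytes: "café.py" encoded as ISO-8859-1.
# The byte 0xE9 is not a valid utf-8 sequence here.
raw_name = b"caf\xe9.py"


def decode_strict(name: bytes) -> str:
    """Decode as utf-8 with the strict handler; raises on invalid bytes."""
    return name.decode("utf-8", "strict")


def decode_surrogateescape(name: bytes) -> str:
    """Decode as utf-8, mapping each undecodable byte to a lone surrogate."""
    return name.decode("utf-8", "surrogateescape")


# Strict decoding rejects the filename outright.
try:
    decode_strict(raw_name)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# surrogateescape keeps the undecodable byte as U+DCxx, so the original
# bytes can be recovered by encoding with the same handler.
escaped = decode_surrogateescape(raw_name)
round_tripped = escaped.encode("utf-8", "surrogateescape")

print(strict_failed)              # True
print(ascii(escaped))             # 'caf\udce9.py'
print(round_tripped == raw_name)  # True
```

With the strict handler, a name like this cannot be passed through at all; with surrogateescape, the name survives a decode/encode round trip, which is why that handler is a candidate here.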
If we choose to use something other than utf-8 in strict mode, here is an incomplete list of functions that have to be patched:

 - parser:
   * initerr()
   * err_input()
 - ast:
   * ast_error_finish()

And the list of impacted functions (parsing functions accepting filenames):

 - PyParser_ParseStringFlagsFilename()
 - PyParser_ParseFile*()
 - PyParser_ASTFromString(), PyParser_ASTFromFile()
 - PyAST_FromNode()
 - PyRun_SimpleFile*()
 - PyRun_AnyFile*()
 - PyRun_InteractiveOneFlags()
 - etc.

All these functions are public, and I don't think it would be a good idea to change the encoding (e.g. to iso-8859-1). We can use a different error handler (especially surrogateescape, as suggested in the initial message) and/or create new functions accepting unicode filenames.

--

I'm working on undecodable filenames in issues #8611 and #9425, especially on the import machinery part. Once the import machinery is fully unicode compliant, the last part will be the "parser machinery" (Parser/*.c). It is a little more complex to patch the parser because of the bootstrap problem: the parser is compiled twice, once with a small subset of the C Python API (using some mockups), and once with the full API.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9713>
_______________________________________