STINNER Victor <[EMAIL PROTECTED]> added the comment:

Example of the problem:

    exec('#header\n# encoding: ISO-8859-1\nprint("h\xe9 h\xe9")\n')
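To make the symptom concrete, here is a small sketch of what happens to that string if its UTF-8 bytes are later decoded with the charset named in the coding header. This only simulates the mismatch with plain str/bytes methods; it is not CPython's actual code path, and the variable names are made up for illustration.

```python
# Illustrative simulation only: it mimics the charset mismatch with
# str.encode()/bytes.decode(), not the real tokenizer.
source = '#header\n# encoding: ISO-8859-1\nprint("h\xe9 h\xe9")\n'

# Step 1: the unicode source is converted to UTF-8 bytes.
utf8_bytes = source.encode("utf-8")

# Step 2 (assumed faulty step): the bytes are decoded with the charset
# declared in the coding header instead of UTF-8.
misdecoded = utf8_bytes.decode("iso-8859-1")

# For comparison: decoding with UTF-8 restores the original text.
roundtrip = utf8_bytes.decode("utf-8")

print(repr(misdecoded))
print(roundtrip == source)
```

Each U+00E9 becomes the two bytes 0xC3 0xA9 in UTF-8, so decoding them as ISO-8859-1 yields two characters per accented letter instead of one.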
exec(unicode) calls source_as_string(), which converts the unicode string to bytes using _PyUnicode_AsDefaultEncodedString() (UTF-8 charset). Then PyRun_StringFlags() is called with the UTF-8 byte string and the PyCF_SOURCE_IS_UTF8 flag. But in the parser, get_coding_spec() recognizes the "#coding:" header and converts the bytes to unicode using the specified charset (which may be different from UTF-8).

The problem is in the function PyAST_FromNode(): the flag is not used in the tokenizer but only in the AST parser. I also see:

    if (flags && flags->cf_flags & PyCF_SOURCE_IS_UTF8) {
        c.c_encoding = "utf-8";
        if (TYPE(n) == encoding_decl) {
    #if 0
            ast_error(n, "encoding declaration in Unicode string");
            goto error;
    #endif
            n = CHILD(n, 0);
        }
    } else if (TYPE(n) == encoding_decl) {
        c.c_encoding = STR(n);
        n = CHILD(n, 0);
    } else {
        /* PEP 3120 */
        c.c_encoding = "utf-8";
    }

The ast_error() call, currently disabled by "#if 0", could be re-enabled.

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue4282>
_______________________________________