STINNER Victor <victor.stin...@haypocalc.com> added the comment:

On Friday, 24 December 2010 at 14:46 +0000, Baptiste Carvello wrote:
> the patch solves the bug for me as well (using locale "C", the
> filesystem encoding is utf-8). However, I do not understand why the
> patch checks that the shebang line decodes with both utf-8 and the
> file's encoding. The shebang line is only used by the kernel to locate
> the interpreter, so none of these should matter. Or have I misunderstood
> the patch?
The shebang is read by 3 different functions:

 a) the shell reads the first line: if it starts with "#!", it's a
    shebang: the shell reads the command and options and executes it
 b) Python searches for a "#coding:xxx" pattern in the first or the
    second line using a binary parser
 c) Python reads the file using the Python encoding: the encoding
    written in the #coding:xxx header, or UTF-8 by default

(a) The shell reads the file as a binary file; it doesn't care about the
encoding. It reads byte strings and passes them to the kernel.

(b) The parser starts with the default encoding, UTF-8. Even if the file
encoding is not UTF-8, all lines before the #coding:xxx cookie are read
as UTF-8 (Python only checks for the cookie in the first or the second
line). The shebang has to be written on the first line, so the cookie
cannot be written before the shebang => the shebang has to be decodable
from UTF-8.

(c) If the file encoding is not UTF-8, a #coding:xxx cookie is used and
the whole file (including the shebang) has to be decodable from this
encoding => the shebang has to be decodable from the file encoding.

So the shebang has to be decodable from UTF-8 and from the file
encoding. I should maybe add a comment about that in the patch.

Example of (b) issue:
---
$ ./build/scripts-3.2/2to3
  File "./build/scripts-3.2/2to3", line 1
SyntaxError: Non-UTF-8 code starting with '\xff' in
file ./build/scripts-3.2/2to3 on line 1, but no encoding declared; see
http://python.org/dev/peps/pep-0263/ for details
---
The shebang is b'#!/home/haypo/tmp/py3k\xff/bin/python3.2\n', my locale
encoding is UTF-8 and the file has no encoding cookie (it is encoded in
UTF-8).

--

copy_script.patch fixes an issue when the configure prefix is not ASCII
(especially when the prefix is not decodable from UTF-8).
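
The double requirement above (decodable from UTF-8 and from the file encoding) can be sketched as a small standalone check. This is only an illustration, not the code in copy_script.patch: the function name shebang_is_safe and its file_encoding parameter are hypothetical names chosen for this example.

```python
def shebang_is_safe(shebang: bytes, file_encoding: str = "utf-8") -> bool:
    """Return True if the shebang bytes decode from UTF-8 *and* from
    the file encoding (hypothetical helper, not taken from the patch)."""
    # UTF-8 is checked because the cookie parser reads the first lines
    # as UTF-8; file_encoding is checked because the whole file is then
    # decoded with the encoding named by the #coding:xxx cookie.
    for encoding in ("utf-8", file_encoding):
        try:
            shebang.decode(encoding)
        except UnicodeDecodeError:
            return False
    return True


# The shebang from the example above: b'\xff' is never valid in UTF-8.
bad = b'#!/home/haypo/tmp/py3k\xff/bin/python3.2\n'
# A pure ASCII shebang decodes from UTF-8 and from any ASCII superset.
good = b'#!/usr/bin/python3.2\n'

print(shebang_is_safe(bad))
print(shebang_is_safe(good))
```

A prefix that is valid UTF-8 but not ASCII would still be rejected by this sketch for a file declared with "#coding: ascii", which is the (c) case.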
----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6011>
_______________________________________