On Feb 25, 12:00 am, Peter Otten <__pete...@web.de> wrote:
> John Machin wrote:
> > Your Python 2.x code should be TESTED before you poke 2to3 at it. In
> > this case just trying to run or import the offending code file would
> > have given an informative syntax error (you have declared the .py file
> > to be encoded in UTF-8 but it's not).
>
> The problem is that Python 2.x accepts arbitrary bytes in string
> constants.

Ummm ... isn't that a bug? According to section 2.1.4 of the Python
2.7.1 Language Reference Manual:

"""The encoding is used for all lexical analysis, in particular to find
the end of a string, and to interpret the contents of Unicode literals.
String literals are converted to Unicode for syntactical analysis, then
converted back to their original encoding before interpretation
starts ..."""

How do you reconcile "used for all lexical analysis" and "String
literals are converted to Unicode for syntactical analysis" with the
actual (astonishing to me) behaviour?
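
For illustration only, here is a minimal sketch of the kind of check under discussion: does a source file actually decode under the encoding it declares? It is written for Python 3 and is not code from this thread. The helper name check_declared_encoding and both sample sources are hypothetical; tokenize.detect_encoding is the standard-library function that reads the coding declaration.

```python
import io
import tokenize


def check_declared_encoding(source_bytes):
    """Return (declared_encoding, error) for a chunk of Python source.

    Hypothetical helper, for illustration. `error` is None when every
    byte decodes under the declared encoding, otherwise it is the
    UnicodeDecodeError that was raised.
    """
    # detect_encoding only inspects the first two lines (BOM / coding cookie).
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    try:
        source_bytes.decode(encoding)
    except UnicodeDecodeError as exc:
        return encoding, exc
    return encoding, None


# Declares UTF-8, but the string literal holds a lone Latin-1 byte (0xE9).
bad_source = b"# -*- coding: utf-8 -*-\ns = '\xe9'\n"

# Same bytes, but the declaration matches what is actually in the file.
good_source = b"# -*- coding: latin-1 -*-\ns = '\xe9'\n"

for label, src in (("bad", bad_source), ("good", good_source)):
    declared, problem = check_declared_encoding(src)
    print(label, declared, problem)
```

Running something like this over a 2.x file before handing it to 2to3 would flag the mismatch between the declaration and the bytes, independently of what the 2.x compiler itself chooses to accept.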