On Mar 21, 1:54 am, Laszlo Nagy <[EMAIL PROTECTED]> wrote:
> >>> eval( "# -*- coding: latin2 -*-\n" + expr)
> u'\u0170' # You can specify the encoding for eval, that is cool.
>

I didn't think of that. That's pretty cool.

> I hope it is clear now. Inside eval, an unicode object was created from
> a binary string. I just discovered that PEP 0263 can be used to specify
> source encoding for eval. But still there is a problem: eval should not
> assume that the expression is in any particular encoding. When it sees
> something like '\xdb' then it should raise a SyntaxError - same error
> that you should get when running a .py file containing the same expression:
>
> >>> file('test.py','wb+').write(expr + "\n")
> >>> ^D
> [EMAIL PROTECTED]:~$ python test.py
>   File "test.py", line 1
> SyntaxError: Non-ASCII character '\xdb' in file test.py on line 1, but
> no encoding declared; see http://www.python.org/peps/pep-0263.html for
> details
>
> Otherwise the interpretation of the expression will be ambiguous. If
> there is any good reason why eval assumed a particular encoding in the
> above example?
>

I'm not sure, but being in a terminal session means a lot can be
inferred about what encoding a stream of bytes is in. I don't know off
the top of my head where this would be stored or how Python tries to
figure it out.

> My problem is solved anyway. Anytime I need to eval an expression, I'm
> going to specify the encoding manually with # -*- coding: XXX -*-. It is
> good to know that it works for eval and its counterparts. And it is
> unambiguous. :-)
>

I would personally adopt the Py3k convention and work with text as
unicode and bytes as byte strings. That is, you should pass in a
unicode string every time to eval, and never a byte string.

--
http://mail.python.org/mailman/listinfo/python-list
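
A minimal sketch of the "always pass text to eval" suggestion above, written for Python 3. The helper name `eval_decoded` is hypothetical, and the value of `expr` is an assumption: the thread never shows it, so here it is taken to be the byte string `u'\xdb'`. The idea is to decode the bytes with an explicitly named encoding first, so eval never has to guess:

```python
def eval_decoded(expr_bytes, encoding):
    """Decode expr_bytes with an explicitly named encoding, then eval the text.

    Hypothetical helper for illustration; eval never sees raw bytes here.
    """
    text = expr_bytes.decode(encoding)
    return eval(text)


# Assumed stand-in for the `expr` used earlier in the thread.
expr = b"u'\xdb'"

# The same byte means different characters under different encodings,
# which is why the encoding has to be stated rather than inferred.
latin2_result = eval_decoded(expr, "latin2")
latin1_result = eval_decoded(expr, "latin1")

print(ascii(latin2_result))
print(ascii(latin1_result))
```

Byte 0xDB is U+0170 in latin2 but U+00DB in latin1, so the two calls return different one-character strings from identical input bytes. As with any use of eval, this is only appropriate for expressions you trust.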