Hi Jonathan,

I think I made it too complicated and did not concentrate on the question. I could write answers to your post, but instead I'm going to explain it formally:

    >>> s = '\xdb'  # This is a byte, without encoding specified.
    >>> s.decode('latin1')
    u'\xdb'  # The above byte decoded in latin1 encoding
    >>> s.decode('latin2')
    u'\u0170'  # The same byte decoded in latin2 encoding
    >>> expr = 'u"' + s + '"'  # Create an expression for eval
    >>> expr
    'u"\xdb"'  # expr is not a unicode string - it is a binary string and it has no encoding assigned.
    >>> print repr(eval(expr))  # Eval it
    u'\xdb'  # What? Why was it decoded as 'latin1'? Why not 'latin2'? Why not 'ascii'?
    >>> eval("# -*- coding: latin2 -*-\n" + expr)
    u'\u0170'  # You can specify the encoding for eval, that is cool.

I hope it is clear now. Inside eval, a unicode object was created from a binary string. I just discovered that PEP 0263 can be used to specify the source encoding for eval. But there is still a problem: eval should not assume that the expression is in any particular encoding. When it sees something like '\xdb', it should raise a SyntaxError, the same error that you get when running a .py file containing the same expression:

    >>> file('test.py','wb+').write(expr + "\n")
    >>> ^D
    $ python test.py
      File "test.py", line 1
    SyntaxError: Non-ASCII character '\xdb' in file test.py on line 1, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

Otherwise the interpretation of the expression is ambiguous. Is there any good reason why eval assumed a particular encoding in the above example?

Sorry for my misunderstanding - my English is not perfect. My problem is solved anyway. Whenever I need to eval an expression, I'm going to specify the encoding manually with # -*- coding: XXX -*-. It is good to know that it works for eval and its counterparts. And it is unambiguous. :-)

Best,

   Laszlo
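
P.S. The workaround of prepending a PEP 0263 coding line before calling eval can be wrapped in a small helper. This is only a minimal sketch: the name eval_with_encoding and its parameters are hypothetical, and it assumes the expression is passed in as a byte string. It is written so that it should also run on Python 3, where eval accepts a bytes object and reads the coding line from it.

```python
def eval_with_encoding(expr_bytes, encoding):
    """Evaluate a byte-string expression under an explicit source encoding.

    Hypothetical helper: it prepends a PEP 263 coding line so that eval
    does not have to guess how non-ASCII bytes should be decoded.
    """
    # The coding line itself is pure ASCII, so it can be encoded safely.
    cookie = ("# -*- coding: %s -*-\n" % encoding).encode("ascii")
    return eval(cookie + expr_bytes)


expr = b'u"\xdb"'  # the same expression as in the session above

# The same bytes, interpreted under two different declared encodings.
as_latin1 = eval_with_encoding(expr, "latin1")
as_latin2 = eval_with_encoding(expr, "latin2")

print(repr(as_latin1))
print(repr(as_latin2))
```

The two results differ even though the input bytes are identical, which is exactly why I want the encoding to be stated at the call site instead of being assumed.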