>> I tried to use eval with/without unicode strings and it worked. Example:
>>
>> >>> eval( u'"徹底したコスト削減 ÁÍŰŐÜÖÚÓÉ трирова"' ) == eval( '"徹底したコスト削減 ÁÍŰŐÜÖÚÓÉ трирова"' )
>> True
>>
> When you feed your unicode data into eval(), it doesn't have any
> encoding or decoding work to do.
Yes, but what about

eval( 'u' + '"徹底したコスト削減 ÁÍŰŐÜÖÚÓÉ трирова"' )

The passed expression is not unicode. It is a "normal" string: a sequence
of bytes. It will be evaluated by eval, and eval should know how to decode
that byte sequence. It is the same way the interpreter needs to know the
encoding of a source file when it sees the

u"徹底したコスト削減 ÁÍŰŐÜÖÚÓÉ трирова"

byte sequence in it: before the unicode instance can be created, the bytes
need to be decoded (or not, depending on the encoding of the source).

A string passed to eval IS Python source, and it SHOULD have an encoding
specified (unless it is already a unicode string, in which case this magic
is not needed).

Consider this:

exec("""
import codecs
s = u'Ű'
codecs.open("test.txt", "w+", encoding="UTF8").write(s)
""")

Facts:

- The source passed to exec is a normal string, not unicode.
- The variable "s", created inside the exec() call, will be a unicode
  string. However, its value may be Ű or something else, depending on the
  source encoding. E.g. under ASCII encoding the source is invalid, and
  exec() should raise a SyntaxError like:

  SyntaxError: Non-ASCII character '\xc5' in file c:\temp\aaa\test.py on
  line 1, but no encoding declared; see
  http://www.python.org/peps/pep-0263.html for details

Well, at least this is what I think. If I'm not right, then please explain
why.

Thanks,

Laszlo
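
P.S. To make the first point concrete, here is a minimal sketch (Python 2)
of the "already unicode" path: decode the byte string yourself, with an
encoding that YOU choose (UTF-8 below is my assumption, not something eval
could discover on its own), so that eval() receives a unicode object and
no decoding guesswork is left to the compiler:

raw = 'u"\xc5\xb0"'            # the UTF-8 bytes of the source text u"Ű"
source = raw.decode("utf-8")   # now a unicode object: u'u"\u0170"'
value = eval(source)           # unicode source, nothing left to decode
assert value == u'\u0170'      # U+0170 is Ű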
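
P.P.S. Going the other way, I would expect a PEP 263 coding declaration to
work inside the string itself, the same way it does at the top of a file.
I have not verified this against the tokenizer, so take it as a sketch of
the behaviour I expect, not as a statement of fact:

src = "# -*- coding: utf-8 -*-\ns = u'\xc5\xb0'\n"  # \xc5\xb0 = Ű in UTF-8
ns = {}
exec src in ns
print repr(ns["s"])   # expecting u'\u0170' if the cookie is honoured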