On Mon, Dec 20, 2010 at 2:08 PM, Martin Hvidberg <mar...@hvidberg.net> wrote:
> Question:
> In the last printout, tagged >InReturLst> all entries turn into uni-code.
> What happens here?

Actually, they were all unicode to begin with. You're using codecs.open to read the file, which transparently decodes the data using the supplied encoding (in this case, utf-8). If you wanted to preserve the original bytes, you would just use the open() function to open the file instead.

> Look for the word 'FANØ'. This word changes from 'FANØ' to u'FAN\xd8' –
> That's a problem to me, and I don't want it to change like this.

This happens because you're printing a list instead of a unicode string. When you print the unicode string, it tries to print the actual characters. When you print the list, it constructs the repr of the list, which uses the repr of each of the items in the list, and the repr of the unicode string is u'FAN\xd8'. If you don't want this to happen, then you will need to format the list as a string yourself instead of relying on print to do what it thinks you might want.

Cheers,
Ian
--
http://mail.python.org/mailman/listinfo/python-list
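
The difference between printing a string and printing a list, as described in the reply above, can be sketched roughly as follows. This is only an illustration under some assumptions: it is written for Python 3, where ascii() produces the escaped form that repr() gave in the Python 2 of this thread, and the sample bytes and the names raw, word, words and formatted are hypothetical stand-ins for the poster's file contents.

```python
# Hypothetical sample: the UTF-8 bytes of the word from the thread.
raw = b"FAN\xc3\x98"

# Decoding turns bytes into a text string; codecs.open does this
# transparently for every read when given an encoding.
word = raw.decode("utf-8")
words = [word, u"ABC"]

# The escaped form of a single string (Python 2's repr, minus the u prefix).
word_repr = ascii(word)     # 'FAN\xd8'

# The repr of a list is built from the repr of each item.
list_repr = ascii(words)    # ['FAN\xd8', 'ABC']

# Formatting the list yourself keeps the actual characters.
formatted = u", ".join(words)

print(word_repr)
print(list_repr)
```

Printing formatted instead of the list shows the real characters, provided the terminal's encoding can represent them.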