I'm getting really annoyed with Python over Unicode/ASCII encoding problems.
The string below is the encoding of the Norwegian word "fødselsdag".

>>> s = 'f\xc3\x83\xc2\xb8dselsdag'

I stored the string as "fødselsdag", but somewhere in my code it got translated into the mess above and I cannot get the original string back. It cannot be printed in the console or written to a plain text file. I've tried to convert it using

>>> s.encode('iso-8859-1')
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
>>> s.encode('utf-8')
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)

and nothing helps.

I cannot remember having these problems in earlier versions of Python, and it's really annoying. Even if it's my own fault somehow, handling normal characters like this shouldn't cause this much hassle. Searching Google for "codec can't decode byte", UnicodeDecodeError, etc. produces a bunch of hits, so it's obvious I'm not alone.

Any hints?
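One possible reading of the bytes in s, offered as a sketch rather than a diagnosis: they look like UTF-8 that was mistaken for ISO-8859-1 and then encoded to UTF-8 a second time. If that assumption holds, the steps can be undone in reverse order. The example uses Python 3 syntax (a bytes literal) instead of the Python 2 session above, and the helper name undo_double_utf8 is hypothetical. As for the traceback itself: in Python 2, calling .encode() on a byte string first decodes it implicitly with the ASCII codec, which would explain why an encode call fails with a UnicodeDecodeError.

```python
def undo_double_utf8(raw):
    """Reverse a suspected double UTF-8 encoding.

    Assumption: the text was UTF-8 encoded, misread as ISO-8859-1,
    and then UTF-8 encoded again. If the data was damaged some other
    way, this may raise an error or return different garbage.
    """
    once_decoded = raw.decode('utf-8')           # text that reads 'fÃ¸dselsdag'
    inner_bytes = once_decoded.encode('latin-1')  # back to the first UTF-8 bytes
    return inner_bytes.decode('utf-8')            # the intended text


s = b'f\xc3\x83\xc2\xb8dselsdag'
repaired = undo_double_utf8(s)
print(repaired)  # prints: fødselsdag
```

If this does recover the word, it would suggest that some step in the program decodes UTF-8 input as ISO-8859-1 (or re-encodes an already encoded string), and that step would be the place to look.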