Unicode/ascii encoding nightmare

Thomas W Mon, 06 Nov 2006 11:56:00 -0800

I'm getting really annoyed with Python over Unicode/ASCII encoding
problems.


The string below is the encoding of the Norwegian word "fødselsdag".

>>> s = 'f\xc3\x83\xc2\xb8dselsdag'

I stored the string as "fødselsdag", but somewhere in my code it got
translated into the mess above and I cannot get the original string
back. It cannot be printed in the console or written to a plain text
file. I've tried to convert it using

>>> s.encode('iso-8859-1')
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1:
ordinal not in range(128)

>>> s.encode('utf-8')
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1:
ordinal not in range(128)

And nothing helps. I cannot remember having these problems in earlier
versions of Python, and it's really annoying. Even if it's my own fault
somehow, handling normal characters like this shouldn't cause this
much hassle. Searching Google for "codec can't decode byte",
UnicodeDecodeError, etc. produces a bunch of hits, so it's obvious I'm
not alone.

Any hints?
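
One possible reading of the bytes above, offered as a sketch and not a
confirmed diagnosis: the tracebacks show a decode error from an encode
call, which fits Python 2 implicitly decoding a byte string as ASCII
before encoding it. The bytes themselves look like UTF-8 that was
misread as ISO-8859-1 and then encoded as UTF-8 a second time. Assuming
that is what happened, the steps can be reversed. The example is written
for Python 3, and the helper name undo_double_utf8 is hypothetical.

```python
# Python 3 sketch. The byte string is copied from the post above.
raw = b'f\xc3\x83\xc2\xb8dselsdag'


def undo_double_utf8(data):
    """Hypothetical helper: undo one extra UTF-8 encoding pass.

    Assumes the text was UTF-8, misread as ISO-8859-1, then
    encoded as UTF-8 again.
    """
    # Step 1: decode the outer UTF-8 layer to get the mojibake text.
    mojibake = data.decode('utf-8')
    # Step 2: map each character back to the single byte it came from.
    inner_bytes = mojibake.encode('iso-8859-1')
    # Step 3: decode the inner bytes as the UTF-8 they originally were.
    return inner_bytes.decode('utf-8')


word = undo_double_utf8(raw)
print(word)
```

If the assumption does not hold for a given string, one of the decode
or encode steps raises an error instead of silently returning wrong
text, which makes the guess easy to check.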

-- 
http://mail.python.org/mailman/listinfo/python-list
