Jarek Zgoda wrote:
> Fredrik Lundh wrote:
>
> >>UnicodeDecodeError: 'utf8' codec can't decode bytes in position 13-18:
> >>unsupported Unicode code range
> >>
> >>does anyone have any idea on what could be going wrong? The string
> >>that I store in the database table is:
> >>
> >>'Keinen Text für Übereinstimmungsfehler gefunden'
> >
> > $ more test.py
> > # -*- coding: iso-8859-1 -*-
> > u = u'Keinen Text für Übereinstimmungsfehler gefunden'
> > s = u.encode("iso-8859-1")
> > u = s.decode("utf-8") # <-- this gives an error
> >
> > $ python test.py
> > Traceback (most recent call last):
> >   File "test.py", line 4, in ?
> >     u = s.decode("utf-8") # <-- this gives an error
> >   File "lib/encodings/utf_8.py", line 16, in decode
> >     return codecs.utf_8_decode(input, errors, True)
> > UnicodeDecodeError: 'utf8' codec can't decode bytes in position 13-18:
> > unsupported Unicode code range
>
> I can't wait for the moment when encoded strings go away from Python.
> The more I program in this language, the more confusion this difference
> is causing. Now most functions and various objects' methods accept
> strings and unicode, making it hard to find sources of Unicode*Errors.

Library writers can speed up the transition by hiding the 8-bit interface,
for example:

    import sqlite
    sqlite.I_promise_to_pass_8bit_string_only_in_utf8_encoding(my_signature="sig.gif")

If you don't call this function, 8-bit strings will not be accepted :)

IMHO, if libraries keep on accepting both str and unicode until Python 3.0,
it will just prolong the confusion of Unicode newbies instead of guiding
them in the right direction _right now_.
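
To make the idea a bit more concrete, here is a rough sketch of what such an opt-in gate might look like. It is written against Python 3's str/bytes split purely for illustration, and every name in it (TextGate, promise_utf8_bytes, to_text) is made up for this example. None of it is part of sqlite or any other real library.

```python
class TextGate:
    """Hypothetical input gate: text is always accepted, bytes only after opt-in."""

    def __init__(self):
        # Byte strings are rejected until the caller opts in explicitly.
        self._utf8_bytes_allowed = False

    def promise_utf8_bytes(self):
        """Caller asserts that every byte string passed in is UTF-8 encoded."""
        self._utf8_bytes_allowed = True

    def to_text(self, value):
        """Return value as text, or raise right away with a clear message."""
        if isinstance(value, str):
            return value
        if isinstance(value, bytes):
            if not self._utf8_bytes_allowed:
                raise TypeError("byte strings are not accepted; pass text or "
                                "call promise_utf8_bytes() first")
            # If the promise was broken, this fails here, at the boundary.
            return value.decode("utf-8")
        raise TypeError("expected text or bytes, got %s" % type(value).__name__)


gate = TextGate()
text = "Keinen Text für Übereinstimmungsfehler gefunden"

# Without the opt-in, the ISO-8859-1 bytes are refused outright.
try:
    gate.to_text(text.encode("iso-8859-1"))
except TypeError as exc:
    print("rejected:", exc)

# After the opt-in, correctly encoded bytes round-trip to the same text.
gate.promise_utf8_bytes()
print(gate.to_text(text.encode("utf-8")) == text)
```

With something along these lines, a mis-encoded byte string is either refused or fails to decode at the point where it enters the library, rather than surfacing later as a Unicode*Error somewhere far from its source.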