On Tue, May 31, 2011 at 8:45 AM, Wolfgang Meiners <wolfgangmeiner...@web.de> wrote:
> On 31.05.11 13:32, Daniel Kluev wrote:
>> On Tue, May 31, 2011 at 8:40 AM, Wolfgang Meiners
>> <wolfgangmeiner...@web.de> wrote:
>>> metadata = MetaData('sqlite://')
>>> a_table = Table('tf_lehrer', metadata,
>>>                 Column('id', Integer, primary_key=True),
>>>                 Column('Kuerzel', Text),
>>>                 Column('Name', Text))
>>
>> Use UnicodeText instead of Text.
>>
>>> A_record = A_class('BUM', 'Bäumer')
>>
>> If this is python2.x, use u'Bäumer' instead.
>>
>
> Thank you Daniel.
> So i came a little bit closer to the solution. Actually i dont write the
> strings in a python program but i read them from a file, which is
> utf8-encoded.
>
> So i changed the lines
>
>     for line in open(file,'r'):
>         line = line.strip()
>
> first to
>
>     for line in open(file,'r'):
>         line = unicode(line.strip())
>
> and finally to
>
>     for line in open(file,'r'):
>         line = unicode(line.strip(),'utf8')
>
> and now i get really utf8-strings. It does work but i dont know why it
> works. For me it looks like i change an utf8-string to an utf8-string.
>
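There's no such thing as a UTF-8 string. You have a sequence of bytes (a byte string, `str` in Python 2) and you have a sequence of characters (`unicode`). UTF-8 is an encoding: a rule for converting characters into bytes, and bytes back into characters. You may recognize that the bytes in your file were encoded using UTF-8, but the computer does not know that unless you explicitly tell it.

That is what unicode(line.strip(), 'utf8') does: it decodes the bytes into characters using the encoding you named. Without the second argument, Python 2 falls back to its default codec, which is normally ASCII, and that fails on the bytes that make up 'ä'.

Does that help clear it up?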
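
To make the bytes/characters distinction concrete, here is a minimal sketch. It assumes Python 3, where the byte type is `bytes` and the character type is `str` (in Python 2 the same roles are played by `str` and `unicode`). The helper name `read_utf8_lines` is my own invention for illustration and does not come from the thread.

```python
import io

# The name from the thread, as the raw bytes a UTF-8 file would contain.
raw = b"B\xc3\xa4umer"

# Decoding turns bytes into characters; the encoding has to be named.
text = raw.decode("utf-8")

print(len(raw))   # 7 bytes: the "ä" takes two of them
print(len(text))  # 6 characters

# Encoding goes the other way and gives the original bytes back.
assert text.encode("utf-8") == raw

# The same bytes decoded with a different encoding yield different
# characters, so the bytes alone do not say which encoding was used.
wrong = raw.decode("latin-1")
assert wrong != text


def read_utf8_lines(path):
    """Hypothetical helper: yield the stripped lines of a UTF-8 file as text."""
    # io.open decodes while reading, so each line is already characters.
    with io.open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            yield line.strip()
```

Passing `encoding="utf-8"` to `io.open` does the decoding at read time, which should be equivalent to calling `unicode(line.strip(), 'utf8')` on each line by hand.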