On Mon, Jan 26, 2009 at 3:21 PM, Hans Müller <heint...@web.de> wrote:
> Hi python experts,
>
> at the moment I'm struggling with an annoying problem in conjunction with
> mysql.
>
> I'm fetching rows from a database, which the mysql driver returns as a list
> of tuples.
>
> The default encoding of the database is utf-8.
>
> Unfortunately in the database there are rows with different encodings and
> there is a blob column.
>
> In the app I search for double entries in the database with this code.
>
> hash = {}
> cursor.execute("select * from table")
> rows = cursor.fetchall()
> for row in rows:
>     key = "|".join([str(x) for x in row])    <- here the problem arises
>     if key in hash:
>         print "found double entry"
>
> This code works as expected with python 2.5.2
> With 2.5.1 it shows this error:
>
> key = "|".join(str(x) for x in row)
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u017e' in
> position 3: ordinal not in range(128)
>
> When I replace the str() call by unicode(), I get this error when a blob
> column is being processed:
>
> key = "|".join(unicode(x) for x in row)
>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xfc in position 119:
> ordinal not in range(128)
>
> Please help, how can I convert ANY column data to a string which is usable
> as a key to a dictionary. The purpose of using a dictionary is to find equal
> rows in some database tables. Perhaps using a md5 hash from the column data
> is also an idea ?

unicode takes an optional encoding argument. If you don't specify one, it
uses the default encoding, which is normally ASCII. Try using (untested):

key = u"|".join(unicode(x, encoding="utf-8") for x in row)

> Thanks a lot in advance,
>
> Hans.
> --
> http://mail.python.org/mailman/listinfo/python-list
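
The unicode(x, encoding="utf-8") suggestion only applies to columns that are
UTF-8 byte strings; in Python 2 it raises TypeError for values that are
already unicode or that are not strings at all (ints, None), and a blob
containing a byte such as 0xfc will still fail to decode. Following the md5
idea from the question, here is a rough sketch that hashes the raw bytes of
each column instead of trying to turn everything into text. The names
row_key and find_duplicates are made up for illustration, and the code
assumes Python 2.6+ or Python 3 (it relies on b"" literals and bytearray,
which 2.5 does not have):

```python
import hashlib

# unicode on Python 2, str on Python 3
text_type = type(u"")


def row_key(row, encoding="utf-8"):
    """Return an MD5 hex digest that is equal for rows with equal columns."""
    digest = hashlib.md5()
    for value in row:
        if isinstance(value, text_type):
            tag, data = b"t", value.encode(encoding)
        elif isinstance(value, (bytes, bytearray)):
            # Blob or byte-string column: hash the bytes as they are,
            # no decoding is attempted.
            tag, data = b"b", bytes(value)
        else:
            # Numbers, None, dates, ...: fall back to their repr().
            tag, data = b"o", repr(value).encode("ascii", "backslashreplace")
        # The type tag and length prefix keep ("ab", "c") and ("a", "bc"),
        # or None and the text "None", from producing the same key.
        digest.update(tag + str(len(data)).encode("ascii") + b":" + data)
    return digest.hexdigest()


def find_duplicates(rows):
    """Return (first_index, duplicate_index) pairs for repeated rows."""
    seen = {}
    duplicates = []
    for index, row in enumerate(rows):
        key = row_key(row)
        if key in seen:
            duplicates.append((seen[key], index))
        else:
            # The loop in the question never stores the key, so it could
            # not report anything; this line is the missing piece.
            seen[key] = index
    return duplicates
```

With a DB-API cursor this would be used as
find_duplicates(cursor.fetchall()). Note that a column returned once as a
byte string and once as unicode would count as different values here; if the
driver mixes the two for the same column, normalise them before hashing.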