* Charles Plessy <ple...@debian.org>, 2010-08-18, 09:29:
> I encountered a character set problem: some of the contents of the yaml
> file are encoded in UTF-8, and the UDD is ASCII:
>
> ple...@sd-13492:/org/udd.debian.org/udd$ ./update-and-run.sh bibref
> Unable to inject data for package adun.app. 'ascii' codec can't encode character u'\xe1' in position 39: ordinal not in range(128)
> --> ['adun.app', 'Reference-Author', u'Michael A. Johnston, Ignacio Fdez. Galv\xe1n and Jordi Vill\xe0-Freixa']
> Unable to inject data for package rnahybrid. 'ascii' codec can't encode character u'\xd6' in position 41: ordinal not in range(128)
> --> ['rnahybrid', 'Reference-Author', u'REHMSMEIER, MARC and STEFFEN, PETER and H\xd6CHSMANN, MATTHIAS and GIEGERICH, ROBERT']
> Unable to inject data for package melting. 'ascii' codec can't encode character u'\xe8' in position 6: ordinal not in range(128)
> --> ['melting', 'Reference-Author', u'Le Nov\xe8re, Nicolas']
> Unable to inject data for package t-coffee. 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)
> --> ['t-coffee', 'Reference-Author', u'C\xe9dric Notredame and Desmond G. Higgins and Jaap Heringa']
>
> To solve the problem, I am trying to use the unicode function as in the
> following thread:
>
> http://lists.debian.org/20090522140048.ga6...@an3as.eu
>
> Unfortunately, changes like the one below have no effect:
>
> Index: udd/bibref_gatherer.py
> ===================================================================
> --- udd/bibref_gatherer.py (révision 1777)
> +++ udd/bibref_gatherer.py (copie de travail)
> @@ -49,7 +49,7 @@
>      package, key, value = res
>      query = "EXECUTE bibref_insert (%s, %s, %s)"
>      try:
> -      cur.execute(query, (package, key, value))
> +      cur.execute(query, (package, key, unicode(str(value), 'utf-8')))
>      except UnicodeEncodeError, err:
>        print >>stderr, "Unable to inject data for package %s. %s" % (package, err)
>        print >>stderr, "-->", res

Just a wild guess: the values coming from yaml are Unicode strings, whereas
the database backend expects byte strings. If this is the case,

    cur.execute(query, (package, key, value.encode('utf-8')))

should work.

-- 
Jakub Wilk
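
The suggestion above can be illustrated with a small standalone sketch. It does not touch the database; it only shows that a text value from the error log cannot be encoded as ASCII but encodes cleanly as UTF-8. The helper name `to_db_bytes` is hypothetical and not part of UDD, and whether the database driver really wants byte strings remains the guess stated above. As a side note, in Python 2 `str(value)` on a Unicode value performs the same implicit ASCII encoding, which would explain why the attempted patch raised the same error.

```python
# -*- coding: utf-8 -*-
# Minimal sketch: encode text values to UTF-8 bytes before handing them on.
# Written to behave the same on Python 2 and Python 3.

text_type = type(u"")  # 'unicode' on Python 2, 'str' on Python 3


def to_db_bytes(value, encoding="utf-8"):
    """Hypothetical helper: encode text to bytes, pass other values through."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    return value


# One of the values from the error output above.
author = u"Le Nov\xe8re, Nicolas"

# Encoding with the ASCII codec fails, as in the log.
try:
    author.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding with UTF-8 succeeds and is reversible.
encoded = to_db_bytes(author)
```

With such a helper, the call in the patch would become `cur.execute(query, (package, key, to_db_bytes(value)))`, which is equivalent to the `value.encode('utf-8')` form for text values and leaves anything else unchanged.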