On Wednesday, 10 January 2007 23:18, José Matos wrote:
> On Wednesday 10 January 2007 9:33 pm, Georg Baum wrote:
> > Ah, now I know the problem: If we add string literals to document.body we
> > need to prefix them with u to get unicode string literals: u'bla'. Now I
> > know where to search.
> 
>   That is enough to drive anyone (read me) crazy. :-)

Me too, but I found a workaround:

# Unfortunately we have a mixture of unicode strings and plain strings,
# because we never use u'xxx' for string literals, but 'xxx'.
# Therefore we may have to try two times to normalize the data.
try:
    document.body[i] = unicodedata.normalize("NFKD", document.body[i])
except TypeError:
    document.body[i] = unicodedata.normalize(
        "NFKD", unicode(document.body[i], 'utf-8'))

That works, now I have to find the next bug :-(
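
For reference, the same idea can be sketched as a standalone helper that checks the type up front instead of catching the TypeError. This is only an illustration under a few assumptions: normalize_line is a hypothetical name, the plain strings are assumed to be UTF-8 encoded (as in the workaround above), and it is written for Python 3, where bytes take the role of the plain strings and str the role of the unicode strings.

```python
import unicodedata


def normalize_line(line, encoding="utf-8"):
    """Return the NFKD-normalized text of `line`.

    `line` may be str (already decoded) or bytes (still encoded).
    normalize_line is a hypothetical helper name, not a library function.
    """
    if isinstance(line, bytes):
        # Byte strings have to be decoded before they can be normalized.
        # Assumption: they are encoded as UTF-8 unless told otherwise.
        line = line.decode(encoding)
    return unicodedata.normalize("NFKD", line)


# Mixed input, similar to a body built from both kinds of string literals.
body = ["caf\u00e9", "caf\u00e9".encode("utf-8"), "plain"]
body = [normalize_line(line) for line in body]
```

After the loop every entry is a decoded, normalized string, so later code no longer has to care which kind of literal a line came from.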


Georg
