Duncan Booth wrote:
> Fredrik Lundh <[EMAIL PROTECTED]> wrote:
>
>> ET has already decoded the CP1252 data for you. If you want UTF-8, all
>> you need to do is to encode it:
>>
>> >>> u'Bob\x92s Breakfast'.encode('utf8')
>> 'Bob\xc2\x92s Breakfast'
>>
> I think he is claiming that the encoding information in the file is
> incorrect and therefore it has been decoded incorrectly.
>
> I would think it more likely that he wants to end up with u'Bob\u2019s
> Breakfast' rather than u'Bob\x92s Breakfast' although u'Dog\u2019s dinner'
> seems a probable consequence.

If that's the case, he should read the file as a plain string, decode it with the encoding it is really in, re-encode it (probably into a StringIO), and then feed that to the parser.

Diez
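
A minimal sketch of that approach, written for Python 3 (where `io.BytesIO` takes the place of StringIO for byte data). The helper name `parse_mislabelled_xml`, the sample document, and the assumption that the data is really CP1252 are all made up for illustration. Note that the wrong encoding declaration has to go as well, otherwise the parser would still trust it:

```python
import io
import re
import xml.etree.ElementTree as ET


def parse_mislabelled_xml(raw_bytes, real_encoding="cp1252"):
    """Parse XML whose declared encoding is wrong (hypothetical helper)."""
    # Decode with the encoding the data is really in (an assumption here),
    # ignoring whatever the XML declaration claims.
    text = raw_bytes.decode(real_encoding)
    # Drop the misleading XML declaration; without one, the parser
    # falls back to UTF-8.
    text = re.sub(r"^\s*<\?xml[^>]*\?>", "", text, count=1)
    # Re-encode as UTF-8 and hand the parser a file-like object.
    return ET.parse(io.BytesIO(text.encode("utf-8")))


# Sample data: CP1252 bytes labelled as ISO-8859-1 (invented for this sketch).
raw = (b'<?xml version="1.0" encoding="iso-8859-1"?>'
       b'<name>Bob\x92s Breakfast</name>')
tree = parse_mislabelled_xml(raw)
name = tree.getroot().text
```

With this input, `name` should come out as `'Bob\u2019s Breakfast'` rather than `'Bob\x92s Breakfast'`. If the file contains bytes that CP1252 does not define, the `decode` call raises `UnicodeDecodeError`, so the real encoding needs to be known rather than guessed.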