Anton Vredegoor wrote:
>> So if that is the case: What is the problem then? If you interpret
>> the document as cp1252, and it contains \x93 and \x94, what is
>> it that you don't like about that? In yet other words: what actions
>> are you performing, what are the results you expect to get, and
>> what are the results that you actually get?
>
> Well, where do these cp1252 codes come from? The xml-file claims it's
> utf-8.

Ah. Then the document is most likely right: the byte \x94 can very well
occur in a UTF-8 file, as part of a multi-byte sequence.

> I just tried out some random decodings and cp1252 seemed to work. I
> don't like to have to guess this way. I think John wouldn't even allow
> it :-)

Well, if the document is UTF-8, you should decode it as UTF-8, of course.

Regards,
Martin
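
P.S. A minimal sketch of why \x94 can show up in a perfectly valid UTF-8 file, and why cp1252 only "seemed to work". It is written in Python 3 syntax, and the character U+00D4 is picked purely for illustration; it is an assumption, not something taken from the actual XML file.

```python
# Illustrative character only (an assumption, not from the actual XML file):
# U+00D4 encodes to two bytes in UTF-8, the second of which is 0x94.
data = "\u00d4".encode("utf-8")
print(data)                      # b'\xc3\x94'

# Decoding with the declared encoding recovers the original character.
text = data.decode("utf-8")

# Decoding as cp1252 does not fail, but silently yields two wrong characters:
# 0xC3 becomes U+00C3 and 0x94 becomes U+201D (right double quotation mark).
wrong = data.decode("cp1252")

# A lone 0x94 byte, by contrast, is not valid UTF-8 on its own.
try:
    b"\x94".decode("utf-8")
    lone_is_valid = True
except UnicodeDecodeError:
    lone_is_valid = False

print(repr(text), repr(wrong), lone_is_valid)
```

So a cp1252 decode that raises no error is not evidence that the file is cp1252; it may just be turning each UTF-8 multi-byte sequence into several unrelated characters.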