Simon Willison <[EMAIL PROTECTED]> wrote:

> How can I tell Python "I know this says it's a unicode string, but I
> need you to treat it like a bytestring"?

Can you not just fix your XML file so that it uses the same encoding as it claims to use? If the XML says it contains UTF-8 encoded data then it should not contain cp1252 encoded data, period.

If you really must, then try encoding with latin1 and then decoding with cp1252:

    >>> print u'Bob\x92s Breakfast'.encode('latin1').decode('cp1252')
    Bob’s Breakfast

The latin1 codec converts each Unicode character in the range 0-255 to the single byte with the same value, so the encode step gives you back the original bytes, which can then be decoded as cp1252.
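
If you need this in more than one place, the same round trip can be wrapped in a small helper. This is only a sketch, written for Python 3 (where every str is already Unicode, so the u prefix is dropped); the name fix_cp1252_mojibake is made up for illustration. Note that it will raise UnicodeEncodeError if the string contains anything above U+00FF, and UnicodeDecodeError for the few byte values Python's cp1252 codec leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D).

```python
def fix_cp1252_mojibake(text):
    """Reinterpret a string whose code points 0-255 are really cp1252 bytes.

    Hypothetical helper name; the technique is the encode/decode
    round trip described above.
    """
    # latin1 maps code points 0-255 to the identical byte values,
    # which recovers the raw bytes that were mis-decoded.
    raw = text.encode('latin1')
    # Decode those bytes with the encoding they were actually written in.
    return raw.decode('cp1252')


fixed = fix_cp1252_mojibake('Bob\x92s Breakfast')
print(fixed)  # Bob’s Breakfast (U+2019, right single quotation mark)
```

Plain ASCII text passes through unchanged, since latin1 and cp1252 agree on that range.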