The Python 3.1.1 documentation has the following example:

>>> b'\x80abc'.decode("utf-8", "strict")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: unexpected code byte
>>> b'\x80abc'.decode("utf-8", "replace")
'\ufffdabc'
>>> b'\x80abc'.decode("utf-8", "ignore")
'abc'
The "strict" and "ignore" handlers behave as documented, but "replace" does not. Instead of returning the string with the invalid byte replaced, the call fails:

>>> b'\x80abc'.decode('utf-8', 'replace')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "p:\SW64\Python.3.1.1\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 1: character maps to <undefined>

Is this a known bug in 3.1.1?
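
One detail in the traceback may be worth checking: the exception is a UnicodeEncodeError raised from cp437.py, not a UnicodeDecodeError from the utf-8 codec. That suggests the decode may have succeeded and the failure happened when the interactive prompt tried to echo the result to the console. The following is only a sketch for testing that guess. It assumes the console encoding is cp437, as the traceback implies, and uses ascii() so that printing does not depend on the console.

```python
# Decode with the "replace" handler and keep the result in a variable
# instead of letting the interactive prompt echo it.
text = b'\x80abc'.decode("utf-8", "replace")

# ascii() escapes non-ASCII characters, so this prints on any console.
print(ascii(text))

# The prompt echoes repr(text). Encoding that repr to cp437 (assumed to be
# the console encoding here) should fail the same way the traceback shows.
try:
    repr(text).encode("cp437")
    console_error = None
except UnicodeEncodeError as exc:
    console_error = exc
    print(ascii(str(exc)))
```

If the first print shows the replacement character and the second shows the same 'charmap' error, the "replace" handler is working and the problem lies in displaying U+FFFD on a console whose code page has no such character.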