Joe wrote:

Please provide more information

The Python 3.1.1 documentation has the following example:

Where? I could not find it.

>>> b'\x80abc'.decode("utf-8", "strict")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0:
                    unexpected code byte
>>> b'\x80abc'.decode("utf-8", "replace")
'\ufffdabc'
>>> b'\x80abc'.decode("utf-8", "ignore")
'abc'

Strict and Ignore appear to work as per the documentation but replace
does not.  Instead of replacing the values it fails:

>>> b'\x80abc'.decode('utf-8', 'replace')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "p:\SW64\Python.3.1.1\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 1: character maps to <undefined>

Which interpreter and system? With Python 3.1 (r31:73574, Jun 26 2009, 20:21:35) [MSC v.1500 32 bit (Intel)] on win32, IDLE, I get

>>> b'\x80abc'.decode('utf-8', 'replace') # pasted from above
'�abc'

as per the example.
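
One possible reading of the traceback above, offered as a sketch rather than a diagnosis: the error is a UnicodeEncodeError raised from cp437.py, which suggests the decode itself succeeded and the failure came when the interactive prompt tried to encode the result for display. The snippet below assumes the console encoding is cp437, as the traceback path hints; the variable names are only illustrative.

```python
# Sketch: decoding with each error handler, then encoding the result
# for a console whose encoding is ASSUMED to be cp437 (per the traceback).
data = b'\x80abc'

replaced = data.decode('utf-8', 'replace')  # invalid byte becomes U+FFFD
ignored = data.decode('utf-8', 'ignore')    # invalid byte is dropped

try:
    data.decode('utf-8', 'strict')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Displaying a string means encoding it for the console.
# cp437 has no mapping for U+FFFD, so this step is the one that fails.
try:
    replaced.encode('cp437')
    console_failed = False
except UnicodeEncodeError:
    console_failed = True

# ascii() escapes non-ASCII characters, so the result can be printed
# regardless of the console encoding.
safe = ascii(replaced)
print(safe)
```

If that reading is right, it would also explain why IDLE, which can display U+FFFD, shows the expected result while a cp437 console does not.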

Is this a known bug with 3.1.1?

Did you search the issue tracker at bugs.python.org?
I did and did not find anything. A discrepancy between the doc (if the example really is from the doc) and the behavior (if this really is 3.1) would be a bug, but more info is needed.

Terry Jan Reedy


--
http://mail.python.org/mailman/listinfo/python-list