Joe wrote:
Please provide more information
The Python 3.1.1 documentation has the following example:
Where? I could not find it.
>>> b'\x80abc'.decode("utf-8", "strict")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: unexpected code byte
>>> b'\x80abc'.decode("utf-8", "replace")
'\ufffdabc'
>>> b'\x80abc'.decode("utf-8", "ignore")
'abc'
Strict and Ignore appear to work as per the documentation but replace
does not. Instead of replacing the values it fails:
>>> b'\x80abc'.decode('utf-8', 'replace')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "p:\SW64\Python.3.1.1\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 1: character maps to <undefined>
Which interpreter and system? With Python 3.1 (r31:73574, Jun 26 2009,
20:21:35) [MSC v.1500 32 bit (Intel)] on win32, IDLE, I get
>>> b'\x80abc'.decode('utf-8', 'replace') # pasted from above
'�abc'
as per the example.
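
One guess from the traceback, not something I can confirm without knowing the setup: the last frame is in encodings\cp437.py and the exception is a UnicodeEncodeError, not a UnicodeDecodeError. That suggests the decode itself may have succeeded and the failure came when the interactive console tried to display '\ufffd' in code page 437. A minimal sketch that separates the two steps (the variable names are mine, and the explicit encode to cp437 only imitates what such a console would have to do):

```python
# Step 1: decoding with "replace" yields a str containing U+FFFD.
text = b'\x80abc'.decode('utf-8', 'replace')

# Step 2: cp437 has no mapping for U+FFFD, so encoding the result
# (as a cp437 console would need to do to display it) raises an error.
try:
    text.encode('cp437')
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True

# ascii() escapes non-ASCII characters, so the result is printable anywhere.
safe = ascii(text)
print(safe)
```

If encode_failed comes out True while step 1 runs cleanly, the problem would be in displaying the result, not in decode().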
Is this a known bug with 3.1.1?
Did you search the issues list at bugs.python.org?
I did and did not find anything. The discrepancy between doc (if the
example really is from the doc) and behavior (if really 3.1) would be a
bug, but more info is needed.
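
Something along these lines would gather the details asked for above. It is only a sketch: environment_report is a name I made up, and it uses nothing beyond the standard sys module.

```python
import sys

def environment_report():
    """Collect interpreter version, platform, and console encoding."""
    # sys.stdout may lack an encoding attribute if it has been replaced.
    stdout_encoding = getattr(sys.stdout, "encoding", None)
    return {
        "version": sys.version,
        "platform": sys.platform,
        "stdout_encoding": stdout_encoding,
    }

report = environment_report()
for key, value in report.items():
    # ascii() keeps the output printable on any console.
    print(key, "=", ascii(value))
```

Posting that output along with the exact session would make it possible to tell whether the interpreter or the console is involved.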
Terry Jan Reedy