Re: Python 3.1.1 bytes decode with replace bug

2009-10-26 Thread Joe
Thanks Mark, that is a great suggestion!

> You can also replace the Unicode replacement character U+FFFD with a valid
> cp437 character before displaying it:
>
> >>> b'\x80abc'.decode('utf8','replace').replace('\ufffd','?')
> '?abc'
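A more general variant of Mark's suggestion, as a rough sketch: rather than replacing only U+FFFD, let the target codec replace every character it cannot represent. The helper name `printable_for` is hypothetical and not from the thread, and cp437 is assumed to be the console encoding, as it was in Joe's case.

```python
def printable_for(text, encoding="cp437"):
    """Return text with every character the codec cannot encode turned into '?'.

    Hypothetical helper for illustration; the default codec is an assumption.
    """
    # Encoding with errors="replace" substitutes '?' for unencodable characters;
    # decoding with the same codec turns the bytes back into a str for printing.
    return text.encode(encoding, errors="replace").decode(encoding)


decoded = b"\x80abc".decode("utf-8", "replace")  # '\ufffdabc'
safe = printable_for(decoded)
print(safe)
```

Unlike the single `.replace('\ufffd', '?')` call, this also covers any other character that falls outside the console's code page.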

Re: Python 3.1.1 bytes decode with replace bug

2009-10-26 Thread Joe
Thanks Benjamin for solving the mystery of where the cp437 usage was coming from. So b'\x80abc'.decode("utf-8", "replace") was working properly but then when the interactive prompt tried to print it, it was basically taking the results and doing an encode('cp437', 'strict') which failed because of
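The two steps described above can be reproduced separately, without a cp437 console. This is only a sketch of the behaviour discussed in the thread; the variable names are made up for illustration.

```python
# Step 1: decoding with "replace" succeeds and yields U+FFFD for the bad byte.
decoded = b"\x80abc".decode("utf-8", "replace")

# Step 2: what a cp437 console effectively does when asked to display the result.
try:
    decoded.encode("cp437", "strict")
    encode_failed = False
except UnicodeEncodeError as exc:
    # U+FFFD has no representation in cp437, so strict encoding raises.
    encode_failed = True
    print("encode failed:", exc.reason)

# Encoding with "replace" instead substitutes '?' and does not raise.
fallback = decoded.encode("cp437", "replace")
```

So the decode itself is fine; the error comes from the later encode to the console's code page.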

Re: Python 3.1.1 bytes decode with replace bug

2009-10-25 Thread Mark Tolonen
"Dave Angel" wrote in message news:4ae43150.9010...@ieee.org... Joe wrote: For the reason BK explained, the important difference is that I ran in the IDLE shell, which handles screen printing of unicode better ;-) Something still does not seem right here to me. In the example above the by

Re: Python 3.1.1 bytes decode with replace bug

2009-10-25 Thread Dave Angel
Joe wrote: For the reason BK explained, the important difference is that I ran in the IDLE shell, which handles screen printing of unicode better ;-) Something still does not seem right here to me. In the example above the bytes were decoded to 'UTF-8' with the *nope* you're decoding

Re: Python 3.1.1 bytes decode with replace bug

2009-10-24 Thread Benjamin Kaplan
On Sat, Oct 24, 2009 at 8:47 PM, Joe wrote: >> For the reason BK explained, the important difference is that I ran in >> the IDLE shell, which handles screen printing of unicode better ;-) > > Something still does not seem right here to me. > > In the example above the bytes were decoded to 'UTF-8

Re: Python 3.1.1 bytes decode with replace bug

2009-10-24 Thread Joe
> For the reason BK explained, the important difference is that I ran in > the IDLE shell, which handles screen printing of unicode better ;-) Something still does not seem right here to me. In the example above the bytes were decoded to 'UTF-8' with the replace option so any characters that were

Re: Python 3.1.1 bytes decode with replace bug

2009-10-24 Thread Terry Reedy
Joe wrote: Thanks for your response. Please provide more information The Python 3.1.1 documentation has the following example: Where? I could not find them http://docs.python.org/3.1/howto/unicode.html#unicode-howto Scroll down the page about half way to the "The String Type" section. Th

Re: Python 3.1.1 bytes decode with replace bug

2009-10-24 Thread Joe
Thanks for your response. > Please provide more information > > > The Python 3.1.1 documentation has the following example: > > Where? I could not find them http://docs.python.org/3.1/howto/unicode.html#unicode-howto Scroll down the page about half way to the "The String Type" section. The exa

Re: Python 3.1.1 bytes decode with replace bug

2009-10-24 Thread Benjamin Kaplan
On Sat, Oct 24, 2009 at 1:09 PM, Joe wrote: > The Python 3.1.1 documentation has the following example: > b'\x80abc'.decode("utf-8", "strict") > Traceback (most recent call last): >  File "", line 1, in ? > UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: >              

Re: Python 3.1.1 bytes decode with replace bug

2009-10-24 Thread Terry Reedy
Joe wrote: Please provide more information The Python 3.1.1 documentation has the following example: Where? I could not find them b'\x80abc'.decode("utf-8", "strict") Traceback (most recent call last): File "", line 1, in ? UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in posit

Python 3.1.1 bytes decode with replace bug

2009-10-24 Thread Joe
The Python 3.1.1 documentation has the following example:

>>> b'\x80abc'.decode("utf-8", "strict")
Traceback (most recent call last):
  File "", line 1, in ?
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: unexpected code byte
>>> b'\x80abc'.decode("utf-8
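For reference, a small sketch comparing the standard error handlers on the same bytes. It is not the documentation's example verbatim; the variable names are chosen here for illustration, and the results are stored rather than printed so the script runs on any console.

```python
raw = b"\x80abc"

# "strict" (the default) raises on the invalid byte.
try:
    raw.decode("utf-8", "strict")
    strict_result = "decoded"
except UnicodeDecodeError as exc:
    strict_result = "error at position %d" % exc.start

# "replace" substitutes U+FFFD for the invalid byte.
replaced = raw.decode("utf-8", "replace")

# "ignore" silently drops the invalid byte.
ignored = raw.decode("utf-8", "ignore")

# ascii() shows the escaped form, which is safe to print on any console.
print(strict_result, ascii(replaced), ascii(ignored))
```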