[issue7649] "u'%c' % char" broken for chars in range '\x80'-'\xFF'

Ezio Melotti Thu, 25 Feb 2010 06:53:21 -0800

Ezio Melotti <ezio.melo...@gmail.com> added the comment:

The latest patch (issue7649v4.diff) checks whether the char is ASCII. If it 
is, the char is converted directly to Unicode; otherwise the patch tries to 
decode it using the default encoding and raises a UnicodeDecodeError if the 
decoding fails.
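
A rough model of that logic, for illustration only: the actual patch is C 
code, and the function name format_char and its default_encoding parameter 
are made up here. It is written against Python 3 bytes so it can be run as 
is, with a 1-byte bytes object standing in for the Python 2 str char.

```python
def format_char(char, default_encoding="ascii"):
    """Hypothetical model of the patched u'%c' % char for a 1-byte string.

    `char` is a bytes object of length 1; `default_encoding` stands in for
    the interpreter-wide default encoding (an assumption of this sketch).
    """
    if len(char) != 1:
        raise TypeError("%c requires a single character")
    code = char[0]  # indexing bytes yields an int in Python 3
    if code < 0x80:
        # ASCII: convert directly, no codec involved.
        return chr(code)
    # Non-ASCII: decode with the default encoding; a failure here
    # surfaces as UnicodeDecodeError, as described above.
    return char.decode(default_encoding)
```

With default_encoding="iso-8859-1" the byte '\xe9' decodes to u'\xe9'; with 
"ascii" or "utf-8" the same lone byte raises UnicodeDecodeError.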


I tested it with iso-8859-1 and utf-8 set as the default encoding, and the 
behavior was consistent with "%s". However, the tests assume that the default 
encoding is always ASCII, so they failed (both the tests included in the patch 
and others in test_unicode). I'm not sure whether in this case they should be 
changed/skipped or not.
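
If skipping turns out to be the preferred option, it could look roughly like 
the sketch below. The class and test names are hypothetical and are not taken 
from the patch or from test_unicode; run_suite is only a helper for trying 
the example out.

```python
import sys
import unittest


# Hypothetical test case; the real tests in test_unicode differ.
class PercentCNonASCIITest(unittest.TestCase):

    @unittest.skipUnless(sys.getdefaultencoding() == "ascii",
                         "assumes the default encoding is ASCII")
    def test_non_ascii_byte_raises(self):
        # With an ASCII default encoding, the patched u'%c' % '\x80'
        # is expected to raise UnicodeDecodeError.
        self.assertRaises(UnicodeDecodeError, lambda: u"%c" % b"\x80")


def run_suite():
    """Run the test case above and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(PercentCNonASCIITest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The test is reported as skipped, not failed, whenever 
sys.getdefaultencoding() returns something other than "ascii".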

(Also http://docs.python.org/c-api/unicode.html#built-in-codecs says that 
"Setting encoding to NULL causes the default encoding to be used which is 
ASCII.", but this is not always true. If you think it should be fixed I'll do 
it in a separate commit.)

----------
stage: committed/rejected -> patch review
Added file: http://bugs.python.org/file16369/issue7649v4.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue7649>
_______________________________________