STINNER Victor <victor.stin...@haypocalc.com> added the comment: FYI, you should use ascii() instead of a.encode(\"utf8\") to dump arguments. It's easier to check '\u2603' than b'\xe2\x98\x83' for me :-)
So the bug is fixed in Python 3.2, great! I was thinking that we need a test for that, but then I remembered that I already wrote such test :-) My test checks 3 unicode characters: \xe9, \u20ac, \U0010ffff; but also invalid byte sequences: text = ( b'\xff' # invalid byte b'\xc3\xa9' # valid utf-8 character b'\xc3\xff' # invalid byte sequence b'\xed\xa0\x80' # lone surrogate character (invalid) ) And it should be enough :-) See test_osx_utf8() of test_cmd_line to see the whole test. ---------- _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue9167> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com