On Mon, 20 Oct 2008 06:30:09 -0700, est wrote:

> Like I said, str() should NOT throw an exception BY DESIGN, it's a basic
> language standard.

int() is also a basic language standard, but it is perfectly acceptable for int() to raise an exception if you ask it to convert something into an integer that can't be converted:

int("cat")

What else would you expect int() to do but raise an exception?

If you ask str() to convert something into a string which can't be converted, then what else should it do other than raise an exception? Whatever answer you give, somebody else will argue it should do another thing. Maybe I want failed characters replaced with '?'. Maybe Fred wants failed characters deleted altogether. Susan wants UTF-16. George wants Latin-1.

The simple fact is that there is no 1:1 mapping from all 65,000+ Unicode characters to the 256 possible byte values used by byte strings, so there *must* be an encoding, otherwise you don't know which characters map to which bytes.

ASCII has the advantage of being the lowest common denominator. Perhaps it doesn't make too many people very happy, but it makes everyone equally unhappy.

> str() is not only a convert to string function, but
> also a serialization in most cases (e.g. socket). My simple suggestion
> is: If it's a unicode character, output as UTF-8;

Why UTF-8? That will never do. I want it output as UCS-4.

> otherwise just output
> byte array, please do not encode it with really stupid range(128) ASCII.
> It's not guessing, it's totally wrong.

If you start with a byte string, you can always get a byte string:

>>> s = '\x96 \xa0 \xaa'  # not ASCII characters
>>> s
'\x96 \xa0 \xaa'
>>> str(s)
'\x96 \xa0 \xaa'

--
Steven
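
P.S. To make the encoding point concrete, here is a minimal sketch. It assumes Python 3.3 or later, where text.encode() plays the role that unicode.encode() plays in Python 2. The sample string and the encode_strict helper are made up for illustration; they are not from the thread.

```python
# Sketch: one text string, many possible byte strings.
# Assumes Python 3.3+; the u'' and b'' prefixes keep text vs. bytes explicit.
text = u'caf\xe9'  # 'cafe' with an acute accent: the last character is not ASCII

# Each codec maps the same four characters to different bytes.
encoded = {
    'utf-8': text.encode('utf-8'),
    'latin-1': text.encode('latin-1'),
    'utf-16-le': text.encode('utf-16-le'),
    'utf-32-le': text.encode('utf-32-le'),  # fixed width, UCS-4 style
}

# With ASCII, the error handler decides the fate of the character that
# does not fit. Both choices silently lose information.
replaced = text.encode('ascii', 'replace')  # failed characters become '?'
dropped = text.encode('ascii', 'ignore')    # failed characters are deleted


def encode_strict(s):
    """Hypothetical helper: strict ASCII encode, or None if it fails."""
    try:
        return s.encode('ascii')
    except UnicodeEncodeError:
        return None


for name in sorted(encoded):
    print(name, repr(encoded[name]))
print('replace', repr(replaced))
print('ignore', repr(dropped))
print('strict', repr(encode_strict(text)))
```

The same four characters come out as four, five, eight, or sixteen bytes depending on the codec, and ASCII can either lose the accented character or refuse. Nothing in the string itself says which of these is the "right" answer, which is why the caller has to pick.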