On Mon, Oct 20, 2008 at 12:44 PM, est <[EMAIL PROTECTED]> wrote:
> On Oct 20, 11:46 pm, Steven D'Aprano <[EMAIL PROTECTED]
> cybersource.com.au> wrote:
> > On Mon, 20 Oct 2008 06:30:09 -0700, est wrote:
> > > Like I said, str() should NOT throw an exception BY DESIGN, it's a basic
> > > language standard.
> >
> > int() is also a basic language standard, but it is perfectly acceptable
> > for int() to raise an exception if you ask it to convert something into
> > an integer that can't be converted:
> >
> > int("cat")
> >
> > What else would you expect int() to do but raise an exception?
> >
> > If you ask str() to convert something into a string which can't be
> > converted, then what else should it do other than raise an exception?
> > Whatever answer you give, somebody else will argue it should do another
> > thing. Maybe I want failed characters replaced with '?'. Maybe Fred wants
> > failed characters deleted altogether. Susan wants UTF-16. George wants
> > Latin-1.
> >
> > The simple fact is that there is no 1:1 mapping from all 65,000+ Unicode
> > characters to the 256 bytes used by byte strings, so there *must* be an
> > encoding, otherwise you don't know which characters map to which bytes.
> >
> > ASCII has the advantage of being the lowest common denominator. Perhaps
> > it doesn't make too many people very happy, but it makes everyone equally
> > unhappy.
> >
> > > str() is not only a convert to string function, but
> > > also a serialization in most cases.(e.g. socket) My simple suggestion
> > > is: If it's a unicode character, output as UTF-8;
> >
> > Why UTF-8? That will never do. I want it output as UCS-4.
> >
> > > other wise just ouput
> > > byte array, please do not encode it with really stupid range(128) ASCII.
> > > It's not guessing, it's totally wrong.
> >
> > If you start with a byte string, you can always get a byte string:
> >
> > >>> s = '\x96 \xa0 \xaa' # not ASCII characters
> > >>> s
> > '\x96 \xa0 \xaa'
> > >>> str(s)
> > '\x96 \xa0 \xaa'
> >
> > --
> > Steven
>
> In fact Python handles characters well than most other open-source
> programming languages. But still:
>
> 1. You can explain str() in 1000 ways, there are 1001 more confusing
> error on all kinds of python apps. (Not only some of the scripts I've
> written, but also famous enough apps like Boa Constructor
> http://i36.tinypic.com/1gqekh.jpg. This sucks hard, right?)
>
> 2. Anyone please kindly tell me how can I define a customized encoding
> (namely 'ansi') which handles range(256) so I can
> sys.setdefaultencoding('ansi') once and for all?
> --
> http://mail.python.org/mailman/listinfo/python-list
>
There is no such thing as the "ansi" encoding. The encoding that the American National Standards Institute actually standardized for general text is 7-bit ASCII (ANSI X3.4), which is what Python uses by default. You are probably thinking of cp1252 (Windows-1252), the Windows Western European code page. Windows labels it an "ANSI code page", but it isn't actually an ANSI standard.
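
For what it's worth, here is a rough sketch of what I think you are after. It is written for Python 3, where bytes plays the role of Python 2's str, so treat it as an illustration rather than a drop-in fix for your scripts. It assumes the stock "latin-1" and "cp1252" codecs; the variable names are made up for the example. The point is that latin-1 is the codec that really does cover range(256), while cp1252 does not quite.

```python
# Python 3 sketch: bytes here stands in for Python 2's str.
raw = bytes(range(256))

# latin-1 maps byte N to code point N, so all 256 values round-trip.
as_text = raw.decode("latin-1")
round_trip = as_text.encode("latin-1")

# cp1252 assigns printable characters to most of 0x80-0x9F,
# e.g. 0x80 is the euro sign and 0x96 is an en dash.
euro = b"\x80".decode("cp1252")
en_dash = b"\x96".decode("cp1252")

# A few cp1252 slots (such as 0x81) are unassigned in Python's codec,
# so cp1252 is not a complete range(256) mapping.
try:
    b"\x81".decode("cp1252")
    cp1252_covers_all = True
except UnicodeDecodeError:
    cp1252_covers_all = False

# Naming the codec explicitly avoids relying on the ASCII default.
encoded = "caf\u00e9".encode("cp1252")
```

Passing the codec name at each encode/decode call is the safer habit. Changing the interpreter-wide default tends to hide exactly the kind of error you are seeing.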