John Machin <[EMAIL PROTECTED]> writes:

> On Oct 21, 11:03 pm, Ben Finney <[EMAIL PROTECTED]>
> wrote:
> > John Machin <[EMAIL PROTECTED]> writes:
> > > I don't understand the point or value of filtering out all byte values
> > > greater than 127
> >
> > That's only done if the encoding isn't otherwise specified. In which
> > case, ASCII is the documented default encoding. In which case, it
> > *must* be restricted to code points 0+IBM-127, otherwise it's not ASCII.
> >
> > The value of doing this is to make it rapidly and repeatably apparent
> > when the programmer's assumptions about character encoding are false,
> > allowing the programming error to be fixed early rather than late.
>
> "make it rapidly and repeatably apparent ..." is much better achieved
> by raising an exception.
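
For concreteness, here is a minimal sketch of the two behaviours being
compared: silently filtering out bytes above 127 versus letting a strict
ASCII decode raise. The function names are hypothetical, invented for
illustration only, and the sketch assumes Python 3 syntax; it is not the
code under discussion in this thread.

```python
def decode_filtering(data: bytes) -> str:
    """Hypothetical helper: silently drop every byte above 127."""
    return bytes(b for b in data if b < 128).decode("ascii")


def decode_strict_ascii(data: bytes) -> str:
    """Hypothetical helper: decode as ASCII, raising on any byte above 127."""
    # errors="strict" is the default for bytes.decode()
    return data.decode("ascii")


# Non-ASCII input: the UTF-8 encoding of "café"
sample = "café".encode("utf-8")

# Filtering hides the problem: the accented character just vanishes.
print(decode_filtering(sample))

# Strict decoding makes the wrong assumption visible immediately.
try:
    decode_strict_ascii(sample)
except UnicodeDecodeError as exc:
    print("input is not ASCII:", exc)
```

With filtering, the caller gets a plausible-looking but wrong string; with
the strict decode, the false assumption about the encoding surfaces at the
point where it is made.
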

Ah, I misread; I thought you were asking about the value of defaulting
to ASCII and therefore raising an exception. It seems we agree on
that, then.

> What is that 0+IBM-127 +IBw-guess+IB0- gibberish in your posting?

It wasn't in my message as sent to my news server, nor as I read the
message in comp.lang.python. The message was encoded using UTF-8.
Perhaps it's since been munged in transit to your eyeballs by any of a
number of intermediaries.

-- 
 \     “I bought some batteries, but they weren't included; so I had |
  `\                        to buy them again.” —Steven Wright |
_o__)                                                                  |
Ben Finney
-- 
http://mail.python.org/mailman/listinfo/python-list