John Machin <[EMAIL PROTECTED]> writes:

> I don't understand the point or value of filtering out all byte values
> greater than 127

That's only done if the encoding isn't otherwise specified. In that
case, ASCII is the documented default encoding, so the input *must* be
restricted to code points 0–127; otherwise it's not ASCII.
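
To make the 0–127 restriction concrete, here is a minimal sketch in
present-day Python 3. It is not the code under discussion; the helper
names is_ascii and decode_default are hypothetical, made up for this
illustration. It relies only on the built-in bytes.decode, which is
strict by default.

```python
def is_ascii(data: bytes) -> bool:
    """Return True if every byte value is in the ASCII range 0-127."""
    return all(byte < 128 for byte in data)


def decode_default(data: bytes, encoding: str = "ascii") -> str:
    """Decode with an explicit encoding, falling back to ASCII.

    Hypothetical helper: bytes.decode() uses errors="strict" by default,
    so any byte above 127 raises UnicodeDecodeError under ASCII.
    """
    return data.decode(encoding)


if __name__ == "__main__":
    print(is_ascii(b"plain text"))      # every byte is below 128
    print(is_ascii(b"caf\xe9"))         # 0xE9 is outside ASCII
    try:
        decode_default(b"caf\xe9")      # no encoding given: ASCII assumed
    except UnicodeDecodeError as exc:
        print("rejected:", exc.reason)
    # Naming the encoding explicitly lifts the restriction.
    print(decode_default(b"caf\xe9", "latin-1"))
```

With an encoding named explicitly, the same bytes decode without
complaint; the restriction applies only to the unspecified default.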

The value of doing this is to make it rapidly and repeatably apparent
when the programmer's assumptions about character encoding are false,
allowing the programming error to be fixed early rather than late.
This is, in my estimation, more valuable than heuristic magic to
“guess” the encoding, which leads to a debugging nightmare when
that guesswork fails in unpredictable ways later in the program's
life.
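
As a rough illustration of that trade-off, the sketch below (Python 3)
contrasts a guessing decoder with a strict one. Both functions are
hypothetical and exist only for this example; the "heuristic" here is
deliberately simple (try ASCII, then fall back to Latin-1, which accepts
every byte value), and real guessing code may behave differently.

```python
def guess_decode(data: bytes) -> str:
    """Hypothetical heuristic: try ASCII, then fall back to Latin-1.

    Latin-1 maps every byte to a code point, so this never raises,
    even when the guess is wrong.
    """
    try:
        return data.decode("ascii")
    except UnicodeDecodeError:
        return data.decode("latin-1")


def strict_decode(data: bytes, encoding: str = "ascii") -> str:
    """Decode strictly; a wrong encoding assumption raises at once."""
    return data.decode(encoding)


if __name__ == "__main__":
    raw = "caf\u00e9".encode("utf-8")   # the bytes are really UTF-8

    # The guess "succeeds" but yields mojibake that surfaces much later.
    print(repr(guess_decode(raw)))

    # The strict default fails at the point of the mistaken assumption.
    try:
        strict_decode(raw)
    except UnicodeDecodeError as exc:
        print("failed early:", exc.reason)

    # Once the programmer states the real encoding, decoding is correct.
    print(strict_decode(raw, "utf-8"))
```

The guessing version returns a wrong string without any error, so the
fault shows up wherever that string is next used; the strict version
points straight at the line where the encoding was assumed.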

-- 
 \         “My girlfriend has a queen sized bed; I have a court jester |
  `\   sized bed. It's red and green and has bells on it, and the ends |
_o__)                                         curl up.” —Steven Wright |
Ben Finney
--
http://mail.python.org/mailman/listinfo/python-list
