John Machin <[EMAIL PROTECTED]> writes:

> I don't understand the point or value of filtering out all byte values
> greater than 127

That's only done if the encoding isn't otherwise specified. In that case, ASCII is the documented default encoding, so the input *must* be restricted to code points 0–127; otherwise it's not ASCII.

The value of doing this is to make it rapidly and repeatably apparent when the programmer's assumptions about character encoding are false, allowing the programming error to be fixed early rather than late.

This is, in my estimation, of more value than heuristic magic to “guess” the encoding, and the resultant debugging nightmare when that guesswork fails in unpredictable ways later in the program's life.

-- 
Ben Finney
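
A minimal sketch of the fail-early behaviour described above, assuming Python 3 and its built-in `bytes.decode`. The helper name `decode_text` and the sample bytes are hypothetical, chosen only for illustration; this is not the code under discussion in the thread.

```python
def decode_text(raw, encoding=None):
    """Decode bytes; fall back to strict ASCII when no encoding is given.

    Hypothetical helper for illustration only.
    """
    if encoding is None:
        encoding = "ascii"  # assumed default, as described in the post
    # errors="strict" raises UnicodeDecodeError on any byte above 127
    # when the encoding is ASCII.
    return raw.decode(encoding, errors="strict")


# Sample data: UTF-8 bytes that the caller wrongly assumes are ASCII.
utf8_bytes = "café".encode("utf-8")

# Strict ASCII: the wrong assumption is exposed at the point of decoding.
try:
    decode_text(utf8_bytes)
    failed_early = False
except UnicodeDecodeError as exc:
    failed_early = True
    error_position = exc.start  # index of the first offending byte

# "Guessing" Latin-1 never raises, so the mistake passes silently and
# surfaces later as wrong text.
guessed = utf8_bytes.decode("latin-1")

# Stating the real encoding gives the intended text.
explicit = decode_text(utf8_bytes, "utf-8")
```

With the strict default, the error appears on every run at the same place, which is the "rapidly and repeatably apparent" property; the Latin-1 guess instead yields a plausible-looking but incorrect string.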