On Fri, Nov 14, 2014 at 10:11 AM, Rick Johnson <rantingrickjohn...@gmail.com> wrote:
> # The parse functions have no idea what to do with
> # Unicode, so replace all Unicode characters with "x".
> # This is "safe" so long as the only characters germane
> # to parsing the structure of Python are 7-bit ASCII.
> # It's *necessary* because Unicode strings don't have a
> # .translate() method that supports deletechars.

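For anyone following along, here is a minimal sketch of the kind of folding that comment describes, written for Python 3. The name fold_non_ascii and its placeholder parameter are made up for illustration; this is not the code the comment was taken from, just one way to get the same effect.

```python
def fold_non_ascii(source: str, placeholder: str = "x") -> str:
    """Replace every non-ASCII character with a single placeholder.

    The result has the same length as the input, so offsets computed
    on the folded text still line up with the original text.
    """
    # Code points below 128 are 7-bit ASCII; everything else is folded.
    return "".join(ch if ord(ch) < 128 else placeholder for ch in source)


if __name__ == "__main__":
    original = "na\u00efve = '\u03c0'  # caf\u00e9"
    folded = fold_non_ascii(original)
    print(folded)
    # Structure (quotes, '=', '#') is untouched and the length is unchanged.
    assert len(folded) == len(original)
```

Replacing each character one-for-one, rather than deleting it, is the design choice that matters here: a parser that only cares about ASCII punctuation sees the same structure at the same positions.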
Sounds to me like the functions that collapse whitespace to single spaces, or turn all letters into "A" and all digits into "9", or lowercase/casefold all alphabetics, or strip diacriticals, or anything else of that nature. It's often simpler to fold equivalencies together before parsing or comparing strings. It doesn't mean you don't respect Unicode; in fact, it proves that you *do*. So if you stop calling Unicode "vile", you might actually learn something.

ChrisA