On 2012-03-28, Prasad, Ramit <ramit.pra...@jpmorgan.com> wrote:
>
>> You can't generally just "deal with the ascii portions" without
>> knowing something about the encoding. Say you encounter a byte
>> greater than 127. Is it a single non-ASCII character, or is it the
>> leading byte of a multi-byte character? If the next character is less
>> than 127, is it an ASCII character, or a continuation of the previous
>> character? For UTF-8 you could safely assume ASCII, but without
>> knowing the encoding, there is no way to be sure. If you just assume
>> it's ASCII and manipulate it as such, you could be messing up
>> non-ASCII characters.
>
> Technically, ASCII goes up to 256

No, ASCII only defines 0-127. Values >=128 are not ASCII.

From https://en.wikipedia.org/wiki/ASCII:

    ASCII includes definitions for 128 characters: 33 are non-printing
    control characters (now mostly obsolete) that affect how text and
    space is processed and 95 printable characters, including the space
    (which is considered an invisible graphic).

-- 
Grant Edwards
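
For anyone who wants to poke at this, here is a minimal Python 3 sketch of the two points in the thread: bytes >= 128 are not ASCII, and a byte < 128 is not necessarily an ASCII character unless you know the encoding. The helper name `is_ascii` and the sample strings are illustrative assumptions, not anything from the original posts.

```python
def is_ascii(data: bytes) -> bool:
    """Return True if every byte is in the ASCII range 0-127."""
    return all(b < 128 for b in data)

# 'é' is outside ASCII, so UTF-8 encodes it as two bytes, both >= 128.
sample = "café".encode("utf-8")
print(is_ascii(b"cafe"))   # plain ASCII bytes
print(is_ascii(sample))    # contains bytes >= 128

# The same bytes decode differently depending on the assumed encoding.
as_utf8 = sample.decode("utf-8")      # last two bytes form one character
as_latin1 = sample.decode("latin-1")  # last two bytes are two characters
print(len(as_utf8), len(as_latin1))

# A strict ASCII decode rejects any byte >= 128.
try:
    sample.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print(ascii_ok)

# In Shift JIS, a trailing byte can fall in the ASCII range: the second
# byte of this katakana character is 0x5C, which is a backslash in ASCII.
sjis = "ソ".encode("shift_jis")
print(sjis[1] == ord("\\"))
```

In UTF-8 every byte of a multi-byte sequence is >= 128, which is why assuming ASCII for bytes < 128 is safe there but not in general.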