Re: "convert" string to bytes without changing data (encoding)

Grant Edwards Wed, 28 Mar 2012 12:58:46 -0700

On 2012-03-28, Prasad, Ramit <ramit.pra...@jpmorgan.com> wrote:
> 
>>You can't generally just "deal with the ascii portions" without
>>knowing something about the encoding.  Say you encounter a byte
>>greater than 127.  Is it a single non-ASCII character, or is it the
>>leading byte of a multi-byte character?  If the next byte is less
>>than 128, is it an ASCII character, or a continuation of the previous
>>character?  For UTF-8 you could safely assume ASCII, but without
>>knowing the encoding, there is no way to be sure.  If you just assume
>>it's ASCII and manipulate it as such, you could be messing up
>>non-ASCII characters.
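
To make the quoted point concrete, here is a rough sketch (my own
illustration, not something from earlier in the thread). It assumes
Python 3 and uses Shift_JIS and Latin-1 purely as example encodings;
the byte values were picked to show the problem.

```python
# Two bytes. The second one, 0x5C, is a backslash if read as ASCII.
data = b"\x83\x5c"

# Read as Shift_JIS, the pair is ONE character (katakana SO, U+30BD),
# so the byte below 128 is a trailing byte, not an ASCII character.
as_sjis = data.decode("shift_jis")

# Read as Latin-1, each byte is its own character.
as_latin1 = data.decode("latin-1")

print(len(as_sjis), len(as_latin1))

# "Dealing with the ASCII portions" blindly: swap backslashes for slashes.
# This rewrites half of a Shift_JIS character and corrupts the text.
mangled = data.replace(b"\\", b"/")

# UTF-8 is the safe case mentioned above: every byte of a multi-byte
# sequence is >= 0x80, so a byte below 128 is always plain ASCII.
utf8_bytes = "\u00e9\u20ac".encode("utf-8")
utf8_all_high = all(b >= 0x80 for b in utf8_bytes)
```

Without knowing which of these encodings produced the bytes, there is
no way to tell whether that 0x5C is safe to touch.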
> 
> Technically, ASCII goes up to 256


No, ASCII only defines 0-127.  Values >=128 are not ASCII.

From https://en.wikipedia.org/wiki/ASCII:

  ASCII includes definitions for 128 characters: 33 are non-printing
  control characters (now mostly obsolete) that affect how text and
  space is processed and 95 printable characters, including the space
  (which is considered an invisible graphic).
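
If you want to check this from Python itself, something like the
following should do it. This is only a sketch assuming Python 3; the
variable names are mine.

```python
# All 128 values 0-127 decode cleanly with the "ascii" codec.
ascii_text = bytes(range(128)).decode("ascii")

# Count printable characters (space included) versus control characters.
printable = sum(ch.isprintable() for ch in ascii_text)
control = len(ascii_text) - printable

# The first value outside the range, 128, is rejected by the codec.
try:
    bytes([128]).decode("ascii")
    rejected = False
except UnicodeDecodeError:
    rejected = True

print(printable, control, rejected)
```

The counts match the 95 printable and 33 control characters quoted
above, and anything from 128 up needs some other encoding to have a
meaning.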

-- 
Grant Edwards               grant.b.edwards        Yow! Used staples are good
                                  at               with SOY SAUCE!
                              gmail.com            
-- 
http://mail.python.org/mailman/listinfo/python-list
