At 4:33 PM -0700 6/15/04, Damien Neil wrote:
> On Jun 14, 2004, at 1:54 PM, Dan Sugalski wrote:
>> Parrot provides code points for all graphemes, even for those
>> character sets/encodings which don't inherently do so. Most sets that
>> have variable-length encodings use an escape sequence scheme--the
>> value of the first byte in a character determines whether the
>> grapheme is a one or more byte sequence. When parrot turns these into
>> code points it does it by building up the final value. The first byte
>> is put in the low 8 bits of the integer. If there's a second byte in
>> the sequence the current value is shifted left 8 bits and the new byte
>> is stuffed in the low 8 bits. If there's a third byte in the sequence
>> everything is shifted left again 8 bits and that third byte is stuffed
>> in the bottom, and so on.
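
For readers following the thread, the shift-and-stuff scheme described in the quoted paragraph can be sketched roughly as below. This is only an illustration in Python, not Parrot's actual code: the function name pack_code_point is hypothetical, and it assumes the bytes of one multi-byte sequence have already been split out from the input.

```python
def pack_code_point(seq: bytes) -> int:
    """Fold the bytes of one multi-byte sequence into a single integer.

    Hypothetical helper for illustration only (not a Parrot API).
    The first byte lands in the low 8 bits; each later byte shifts the
    running value left by 8 bits and takes over the low 8 bits.
    """
    value = 0
    for byte in seq:
        value = (value << 8) | byte
    return value


# A one-byte sequence is just its own value; a two-byte sequence ends up
# with the first byte in the high 8 bits and the second in the low 8 bits.
one_byte = pack_code_point(b"\x41")
two_byte = pack_code_point(b"\x81\x40")
three_byte = pack_code_point(b"\x8f\xa1\xa1")
```

Under this sketch the resulting integer is simply the byte sequence read as a big-endian number, so distinct sequences of the same length always yield distinct values.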

> A grapheme consists of one or more code points. Is "provides code points for all graphemes" really what is intended here?

D'oh! No, that's not intended at all. I was sloppy with the search&replace on this. Good catch--thanks.
--
Dan


--------------------------------------it's like this-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
