Dan Sugalski <[EMAIL PROTECTED]> wrote:

> Synthesized code points
> =======================
>
> Parrot provides code points for all graphemes, even for those
> character sets/encodings which don't inherently do so. Most sets that
> have variable-length encodings use an escape sequence scheme--the
> value of the first byte in a character determines whether the
> grapheme is a one or more byte sequence.

Doing so would require Parrot to have intimate knowledge of the
encoding. OTOH, you write that we don't convert in the first place.
That seems to be a contradiction.

> (u)getstring Sw, Sx, Iy, Iz
> (u)setstring Sw, Sx, Iy, Iz

Does that mean that the current C<substr> opcodes get tossed?

> encoding Ix, Sy
> charset Ix, Sy

How do we enumerate encodings and charsets? ICU's ucnv interface takes
an "encoding name".

leo
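
P.S. To make the "intimate knowledge" point concrete, here is a minimal
sketch in Python, using Shift-JIS as the example of a lead-byte scheme.
It is not Parrot code: the function names are made up for illustration,
and the lead-byte ranges are the commonly documented Shift-JIS ones
(including the vendor extension range), which I am assuming here.

```python
def sjis_char_length(lead):
    """Return the byte length of a Shift-JIS character given its first byte.

    Assumed ranges: 0x81-0x9F and 0xE0-0xFC start a two-byte character;
    everything else (ASCII range, half-width katakana) is a single byte.
    """
    if 0x81 <= lead <= 0x9F or 0xE0 <= lead <= 0xFC:
        return 2
    return 1


def split_characters(data):
    """Split a Shift-JIS byte string into one byte sequence per character."""
    chars = []
    i = 0
    while i < len(data):
        n = sjis_char_length(data[i])
        # A truncated final character yields a short slice rather than an error.
        chars.append(data[i:i + n])
        i += n
    return chars
```

Even this trivial splitter has to know the encoding's lead-byte table,
and the trail byte of a two-byte character can fall in the ASCII range,
so indexing by character cannot be done without per-encoding logic.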