At 12:25 PM 6/19/2001 -0700, Hong Zhang wrote:

> > >What do you mean by character size if it does not support variable
> > >length?
> >
> > Well, if strings are to be treated relatively abstractly, and we still
> > want to poke around through the string buffer, we need to know how big a
> > character is.
>
>I agree on this. I think support for variable-length encodings should be
>included.

We certainly need to support them. I'd rather the core interpreter code not 
deal with variable-width data directly, but I'm not sure that I'm going to 
get that wish. :(

> > >The byte based is more useful. I have utf-8, and I want to substr it
> > >to another utf-8. It is painful to convert it or linear search for
> > >character position.
> >
> > The pain is the reason for specifying it in the API. If we force the pain
> > to be local to the encoding then it means that we don't have
> > to embed it in the core.
>
>If it is a common API, I'd like to specify it in core, so each encoding
>implementation can strictly follow it. I believe it is common enough.

What I meant was that the core interpreter itself won't know how to grovel 
through a string buffer filled with variable-length characters. Instead it 
leaves that grovelling to the encoding specific code, which will 
(hopefully) be dynamically loadable. Basically the core specifies the API 
and conformant encoding implementations hide all the gory details.
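
As a rough sketch of that split, assuming a small per-encoding interface: the
core asks the encoding for byte offsets and never looks at the bytes itself.
Python is used purely for illustration, and every name here (Encoding,
char_count, byte_offset, ENCODINGS, core_substr) is made up for the example,
not part of any actual design.

```python
from abc import ABC, abstractmethod


class Encoding(ABC):
    """Hypothetical encoding API that the core would specify."""

    @abstractmethod
    def char_count(self, buf: bytes) -> int:
        """Number of characters stored in buf."""

    @abstractmethod
    def byte_offset(self, buf: bytes, char_index: int) -> int:
        """Byte position at which character number char_index starts."""


class FixedWidth(Encoding):
    """Every character occupies the same number of bytes (e.g. Latin-1)."""

    def __init__(self, width: int = 1):
        self.width = width

    def char_count(self, buf):
        return len(buf) // self.width

    def byte_offset(self, buf, char_index):
        return char_index * self.width


class UTF8(Encoding):
    """Variable width: the linear scan lives here, not in the core."""

    @staticmethod
    def _is_lead(byte):
        # Continuation bytes look like 0b10xxxxxx; anything else starts a character.
        return (byte & 0xC0) != 0x80

    def char_count(self, buf):
        return sum(1 for b in buf if self._is_lead(b))

    def byte_offset(self, buf, char_index):
        seen = 0
        for pos, b in enumerate(buf):
            if self._is_lead(b):
                if seen == char_index:
                    return pos
                seen += 1
        return len(buf)  # index at or past the end of the string


# Stand-in for dynamically loaded encoding modules (an assumption of this sketch).
ENCODINGS = {"latin-1": FixedWidth(1), "utf-8": UTF8()}


def core_substr(buf: bytes, encoding: str, start: int, length: int) -> bytes:
    """Core-side substr: character positions in, bytes out, no byte poking."""
    enc = ENCODINGS[encoding]
    begin = enc.byte_offset(buf, start)
    end = enc.byte_offset(buf, start + length)
    return buf[begin:end]
```

With this shape, a utf-8 to utf-8 substr stays in utf-8 with no conversion;
the linear search still happens, but only inside the UTF8 class.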

                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
