This is a common approach to complex text representation;
the implementations I have seen include IBM IText and SGI
rope. For "rope", each rope is represented by one of: a simple
immutable string, a simple mutable string, a simple immutable
substring of another rope, or a binary node of
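
A minimal sketch of the rope idea described above, in Python. The class and
function names (Leaf, Substring, Concat, flatten, insert) are hypothetical and
not taken from IText or SGI rope. The excerpt is cut off before it says what
the binary node holds; this sketch assumes it joins two child ropes. The
mutable-string variant is left out.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Leaf:
    """A simple immutable string."""
    text: str

    def __len__(self) -> int:
        return len(self.text)


@dataclass(frozen=True)
class Substring:
    """An immutable view of part of another rope (nothing is copied)."""
    base: "Rope"
    start: int
    length: int

    def __len__(self) -> int:
        return self.length


@dataclass(frozen=True)
class Concat:
    """A binary node; assumed here to join two child ropes."""
    left: "Rope"
    right: "Rope"

    def __len__(self) -> int:
        return len(self.left) + len(self.right)


Rope = Union[Leaf, Substring, Concat]


def flatten(rope: Rope) -> str:
    """Collapse a rope into one contiguous string (simple, not efficient)."""
    if isinstance(rope, Leaf):
        return rope.text
    if isinstance(rope, Substring):
        return flatten(rope.base)[rope.start:rope.start + rope.length]
    return flatten(rope.left) + flatten(rope.right)


def insert(rope: Rope, pos: int, text: str) -> Rope:
    """Return a new rope with text inserted at pos; the input is untouched."""
    size = len(rope)
    if not 0 <= pos <= size:
        raise IndexError("insert position out of range")
    head = Substring(rope, 0, pos)
    tail = Substring(rope, pos, size - pos)
    return Concat(Concat(head, Leaf(text)), tail)


original = Leaf("hello world")
edited = insert(original, 5, ",")
```

Because insert only builds new nodes that point at the old rope, the original
stays valid after the edit. Per-node data such as a language tag could be
added as an extra field on each node class.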
Dan Sugalski wrote:
> >If the internal string API is a tree instead of a contiguous memory block,
> >the tagging could be done at the node or branch level.
> >
> >Besides, you get nondestructive inserts.
>
> Yup. The only problem is that it makes the string data significantly more
> complex. I d
At 04:08 PM 6/19/2001 -0500, David L. Nicol wrote:
>Dan Sugalski wrote:
> > Hong Zhang wrote:
> >
> > > I don't see why the core should support language/locale in this detail.
> > > I deal a lot with mixed Chinese/English text files. There is no way to represent
> > > it using plain string, unless you w
Dan Sugalski wrote:
> Hong Zhang wrote:
>
> > I don't see why the core should support language/locale in this detail.
> > I deal a lot with mixed Chinese/English text files. There is no way to represent
> > it using plain string, unless you want to make string be rich-format-text-buffer.
> > Current local
> Taiwanese read traditional Chinese characters, but PRC people read
> simplified Chinese. Even if we take the same data, and same program (code),
> people just read differently. As an end user, I want to make the decision.
> It will drive me crazy if Perl renders/displays the text file using
> tradition
At 02:51 PM 6/19/2001 -0500, Jarkko Hietaniemi wrote:
> > Gah. I thought (and I use the word loosely here) that locales generally
> > specified how a particular character should be interpreted when there's
> > some ambiguity--the high bit ASCII characters spring to mind, given there's
> > a doz
> Gah. I thought (and I use the word loosely here) that locales generally
> specified how a particular character should be interpreted when there's
> some ambiguity--the high bit ASCII characters spring to mind, given there's
> a dozen or more different interpretations with them. I was under th
> I think you misunderstand my point. It is not "a property of the code region",
> but "a property of the context in which the code is running". For example,
> Taiwanese read traditional Chinese characters, but PRC people read
> simplified Chinese. Even if we take the same data, and same program (code
At 02:31 PM 6/19/2001 -0500, Jarkko Hietaniemi wrote:
> > I think you misunderstand my point. It is not "a property of the code region",
> > but "a property of the context in which the code is running". For example,
> > Taiwanese read traditional Chinese characters, but PRC people read
> > simp
At 12:25 PM 6/19/2001 -0700, Hong Zhang wrote:
> > >What do you mean by character size if it does not support variable length?
> >
> > Well, if strings are to be treated relatively abstractly, and we still want
> > to poke around through the string buffer, we need to know how big a
> > characte
> >What do you mean by character size if it does not support variable length?
>
> Well, if strings are to be treated relatively abstractly, and we still want
> to poke around through the string buffer, we need to know how big a
> character is.
I agree on this. I think support variable length
> * Convert from and to UTF-32
> * lengths in bytes, characters, and possibly glyphs
> * character size (with the variable length ones reporting in negative numbers)
What do you mean by character size if it does not support variable length?
> * get and set the locale (This might not be the spot
Since we're going to try and take a shot at being encoding-neutral in the
core, we're going to need some form of string API so the core can actually
manipulate string data. I'm thinking we'll need to be able to at least do
this with strings:
* Convert from and to UTF-32
* lengths in bytes, char
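
A rough Python sketch of what such an encoding-neutral string API might look
like. It covers only the items quoted in this thread: UTF-32 conversion, byte
and character lengths, and a character size that goes negative for
variable-length encodings. The class name EncodedString, its method names, and
the width tables are hypothetical. UTF-32 is modelled as a list of code points.
Reporting the negated maximum width for variable-length encodings is an
assumption, since the thread does not say what the magnitude should mean.
Glyph counts and locale handling are left out.

```python
class EncodedString:
    """Hypothetical wrapper: raw bytes plus the name of their encoding."""

    # Bytes per character for fixed-width encodings (illustrative subset).
    _FIXED_WIDTH = {"latin-1": 1, "utf-32-le": 4}
    # Maximum bytes per character for variable-length encodings.
    _VARIABLE_MAX = {"utf-8": 4, "utf-16-le": 4}

    def __init__(self, data: bytes, encoding: str):
        if encoding not in self._FIXED_WIDTH and encoding not in self._VARIABLE_MAX:
            raise ValueError("encoding not covered by this sketch: " + encoding)
        self.data = data
        self.encoding = encoding

    @classmethod
    def from_utf32(cls, code_points, encoding: str) -> "EncodedString":
        """Build a string in the given encoding from UTF-32 code points."""
        text = "".join(chr(cp) for cp in code_points)
        return cls(text.encode(encoding), encoding)

    def to_utf32(self) -> list:
        """Return the string as a list of UTF-32 code points."""
        return [ord(ch) for ch in self.data.decode(self.encoding)]

    def byte_length(self) -> int:
        """Length of the underlying buffer in bytes."""
        return len(self.data)

    def char_length(self) -> int:
        """Length in characters (code points), independent of encoding."""
        return len(self.data.decode(self.encoding))

    def char_size(self) -> int:
        """Bytes per character; negative if the encoding is variable length.

        Assumption: variable-length encodings report minus their maximum width.
        """
        if self.encoding in self._FIXED_WIDTH:
            return self._FIXED_WIDTH[self.encoding]
        return -self._VARIABLE_MAX[self.encoding]


# Two Chinese characters followed by an ASCII letter, stored as UTF-8.
mixed = EncodedString.from_utf32([0x4E2D, 0x6587, 0x61], "utf-8")
fixed = EncodedString.from_utf32([0x61], "utf-32-le")
```

Code that wants to poke around in the buffer directly can check the sign of
char_size() first: a positive value means character n starts at byte
n * char_size(), while a negative value means the buffer has to be scanned.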