On Wed, Sep 27, 2000 at 01:26:18PM -0400, Dan Sugalski wrote:
> At 11:04 PM 9/26/00 +0100, Simon Cozens wrote:
> >Well, let's go over here, then. I just submitted an RFC for internal string
> >abstraction, which may or may not be the same thing as what you were just
> >talking about.
>
> Well, the impression I get from Larry is that he wants the internals to be
> character-formatting neutral. Scalars should be able to provide their
> contents in a number of different formats, but perl would usually not have
> One True Format internally. (Though how many True Formats there were might
> depend on the size of the platform--tiny ones might be all Unicode, or
> ASCII, or something)
>
> I think the intention is that if data comes in in ASCII or EBCDIC or
> Unicode or Shift-JIS or binary or whatever then perl will keep it that way
> if it understands the format, and convert as needed by bits of the core.

To me that suggests an API through which code asks "can I have the contents
of this SV in format blah?" (where blah is one of the formats the core
understands). You would either get the data directly from the (son of) SV,
or trigger a conversion from whatever format the SV actually holds its
data in.

All parts of perl would get data only via that API (i.e., on a strict
interpretation, every file in current perl not called sv.c).

Whether that conversion is cached (and for how long) would be left as
an exercise for the garbage collector.
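
A rough sketch of the sort of thing I mean, in C. Every name here (sv_t,
enc_t, sv_get_as, sv_drop_cache) is made up for illustration and is not
anything in perl's real API. It only knows one conversion (Latin-1 to
UTF-8) and keeps a single cached copy, which is an assumption of the
sketch rather than a proposal for how caching ought to work.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical set of formats "understood by the core". */
typedef enum { ENC_LATIN1, ENC_UTF8 } enc_t;

/* Hypothetical "son of SV": the data as it arrived, plus one cached conversion. */
typedef struct {
    enc_t          native;     /* format the data is actually held in */
    unsigned char *data;
    size_t         len;
    enc_t          cache_enc;  /* only meaningful while cache != NULL */
    unsigned char *cache;
    size_t         cache_len;
} sv_t;

/* Latin-1 to UTF-8: bytes >= 0x80 expand to two bytes. Caller owns the result. */
static unsigned char *latin1_to_utf8(const unsigned char *in, size_t n,
                                     size_t *out_len)
{
    unsigned char *out = malloc(2 * n + 1);
    size_t i, j = 0;

    if (!out)
        return NULL;
    for (i = 0; i < n; i++) {
        if (in[i] < 0x80) {
            out[j++] = in[i];
        } else {
            out[j++] = (unsigned char)(0xC0 | (in[i] >> 6));
            out[j++] = (unsigned char)(0x80 | (in[i] & 0x3F));
        }
    }
    out[j] = '\0';
    *out_len = j;
    return out;
}

/* The one entry point: "can I have the contents of this SV in format `want`?" */
const unsigned char *sv_get_as(sv_t *sv, enc_t want, size_t *len)
{
    assert(sv != NULL && len != NULL);

    if (want == sv->native) {               /* direct from the SV, no work */
        *len = sv->len;
        return sv->data;
    }
    if (sv->cache && sv->cache_enc == want) {   /* earlier conversion still around */
        *len = sv->cache_len;
        return sv->cache;
    }
    if (sv->native == ENC_LATIN1 && want == ENC_UTF8) {
        size_t n;
        unsigned char *buf = latin1_to_utf8(sv->data, sv->len, &n);
        if (!buf)
            return NULL;
        free(sv->cache);                    /* replace any older cached copy */
        sv->cache = buf;
        sv->cache_len = n;
        sv->cache_enc = want;
        *len = n;
        return buf;
    }
    return NULL;    /* conversion not implemented in this sketch */
}

/* What a collector might call when it decides the cached copy has lived long enough. */
void sv_drop_cache(sv_t *sv)
{
    free(sv->cache);
    sv->cache = NULL;
    sv->cache_len = 0;
}
```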

Probably one actually wants a more flexible API, where the caller specifies
a list of acceptable encodings (a bitmask?) and gets back data in whichever
format is the least work for the internals. Clearly if you're lazy you ask
for everything as fixed-width 32 bits, but if you're not you would switch
between your implementations for, say, 7-bit ASCII and UTF, depending on
what you got back.
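
Something like the following, again only as a sketch in C. The flag names,
sv_pick_encoding and the cost numbers in conv_cost are all invented for
illustration; the real costs would depend on the internals. The only point
is that the caller passes a bitmask and dispatches on the single bit it
gets back.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical encoding flags, one bit each so callers can OR them together. */
enum {
    ENC_ASCII7 = 1 << 0,   /* 7-bit ASCII */
    ENC_UTF8   = 1 << 1,
    ENC_UCS4   = 1 << 2    /* fixed-width 32 bits */
};

/* Invented relative cost of going from `have` to `want`; 0 means no conversion. */
static unsigned conv_cost(unsigned have, unsigned want)
{
    if (have == want)
        return 0;
    if (have == ENC_ASCII7 && want == ENC_UTF8)
        return 0;           /* 7-bit ASCII is already valid UTF-8 */
    if (want == ENC_UCS4)
        return 2;           /* widen every character */
    if (want == ENC_UTF8)
        return 3;           /* variable-length encode */
    return 4;               /* narrowing to ASCII means checking every character */
}

/* Return the acceptable encoding that is least work, or 0 if none was offered. */
unsigned sv_pick_encoding(unsigned have, unsigned acceptable)
{
    unsigned best = 0, best_cost = ~0u, bit;

    for (bit = 1; bit <= (unsigned)ENC_UCS4; bit <<= 1) {
        if (acceptable & bit) {
            unsigned c = conv_cost(have, bit);
            if (c < best_cost) {    /* ties keep the earlier (narrower) format */
                best_cost = c;
                best = bit;
            }
        }
    }
    return best;
}

/* A non-lazy caller: accepts ASCII or UTF-8 and switches on what comes back. */
const char *describe_path(unsigned have)
{
    switch (sv_pick_encoding(have, ENC_ASCII7 | ENC_UTF8)) {
    case ENC_ASCII7: return "7-bit ASCII implementation";
    case ENC_UTF8:   return "UTF-8 implementation";
    default:         return "no acceptable encoding";
    }
}
```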

Nicholas Clark