On Wed, Aug 10, 2005 at 02:56:46PM +0200, Leopold Toetsch wrote:
> Nicholas Clark via RT wrote:
> 
> >I thought that one thing Jarkko learned from perl 5's Unicode model was that
> >the amount of code and pain to support a variable length encoding was
> >greater than the space saving that that encoding gives.
> >
> >In turn Dan had decided that Parrot should internally unpack to some form
> >of fixed width encoding. So all Unicode would be stored internally in the
> >shortest of ISO-8859-1, UCS-2 and UCS-4 that encompassed all the code
> >points used.
> 
> Yes, with the enhancement (also proposed by Dan) that a conversion to 
> fixed width encoding is done lazily i.e. on demand. The substr would be 
> typically such a place to change encoding to fixed.

Aha. That's the subtlety that I missed from all this. The form of the "fix"

> >But having dealt with the fun of variable length encodings, my gut feeling
> >is with Jarkko, that it's probably better to stay fixed width internally.
> 
> My gut feeling is just the same.

Thanks for the clarification.

Nicholas Clark
