Nicholas Clark via RT wrote:

I thought that one thing Jarkko learned from perl 5's Unicode model was that
the amount of code and pain to support a variable length encoding was
greater than the space saving that that encoding gives.

In turn Dan had decided that Parrot should internally unpack to some form
of fixed width encoding. So all Unicode would be stored internally in the
shortest of ISO-8859-1, UCS-16 and UCS-32 that encompassed all the code
points used.

Yes, with the enhancement (also proposed by Dan) that the conversion to a fixed-width encoding is done lazily, i.e. on demand. substr would typically be one such place to switch the encoding to fixed width.
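
To make the idea concrete, here is a rough sketch in Python (not Parrot code). It keeps the string as UTF-8 until the first substr call, then converts once to the narrowest fixed-width form that covers all code points. The names LazyString, narrowest_width and substr are made up for illustration, and the 16-bit form is approximated with Python's "utf-16-le" codec, which is fixed width only because that branch is limited to code points up to U+FFFF.

```python
# Hypothetical sketch: none of these names come from Parrot.
_CODECS = {1: "latin-1", 2: "utf-16-le", 4: "utf-32-le"}


def narrowest_width(text):
    """Return the bytes per code point (1, 2 or 4) needed for text."""
    top = max(map(ord, text), default=0)
    if top <= 0xFF:
        return 1
    if top <= 0xFFFF:
        return 2
    return 4


class LazyString:
    def __init__(self, utf8_bytes):
        self._utf8 = utf8_bytes  # variable-length form, kept as received
        self._fixed = None       # fixed-width form, built on demand
        self.width = None        # bytes per code point once converted

    def _ensure_fixed(self):
        # Convert at most once, the first time indexed access is needed.
        if self._fixed is None:
            text = self._utf8.decode("utf-8")
            self.width = narrowest_width(text)
            self._fixed = text.encode(_CODECS[self.width])

    def substr(self, start, length):
        # With a fixed width, character offsets map directly to byte offsets.
        self._ensure_fixed()
        lo = start * self.width
        hi = (start + length) * self.width
        return self._fixed[lo:hi].decode(_CODECS[self.width])
```

A string that is only ever copied or concatenated never pays for the conversion; the first substr does, and later calls reuse the fixed-width buffer.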

But having dealt with the fun of variable length encodings, my gut feeling
is with Jarkko, that it's probably better to stay fixed width internally.

My gut feeling is just the same.

Nicholas Clark

leo
