On Wed, Aug 20, 2003 at 07:19:42PM -0400, Benjamin Goldberg wrote:
> Leopold Toetsch wrote:
> > But these could be converted to utf32 as soon as they are seen.
>
> For a long string, that could be quite a bit of bloat.

Jarkko's view is that the combined hit of the size of the extra code
needed to skip along the variable-length encoding, the time taken to
execute that code, and (I guess) the cache misses it creates is greater
than the gain from saving space, particularly when the regexp engine is
written assuming O(1) random access. He thinks Perl 5 would probably
have been faster if it had used UTF-32 internally. Maybe Ponie will.

Nicholas Clark
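
To make the trade-off concrete, here is a minimal sketch in C. It is not
Perl 5, Ponie or Parrot code: the function names (utf32_at, utf8_skip,
utf8_decode, utf8_at) are hypothetical, and the UTF-8 input is assumed to
be well-formed and NUL-terminated, with the index in range. The
fixed-width lookup is a single array access, while the variable-width
lookup has to walk the string from the start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-width (UTF-32): code point i is a direct array lookup, O(1).
   The cost is 4 bytes per code point, even for ASCII. */
static uint32_t utf32_at(const uint32_t *s, size_t i)
{
    assert(s != NULL);
    return s[i];
}

/* Variable-width (UTF-8): advance past i code points, O(n) in the offset.
   Assumption: well-formed, NUL-terminated input; no validation is done. */
static const unsigned char *utf8_skip(const unsigned char *s, size_t i)
{
    assert(s != NULL);
    while (i > 0 && *s) {
        s++;                          /* step over the lead byte */
        while ((*s & 0xC0) == 0x80)   /* then over any continuation bytes */
            s++;
        i--;
    }
    return s;
}

/* Decode the single code point starting at p (1 to 4 bytes).
   Assumption: p points at a lead byte of a well-formed sequence. */
static uint32_t utf8_decode(const unsigned char *p)
{
    if (p[0] < 0x80)
        return p[0];
    if ((p[0] & 0xE0) == 0xC0)
        return ((uint32_t)(p[0] & 0x1F) << 6) | (uint32_t)(p[1] & 0x3F);
    if ((p[0] & 0xF0) == 0xE0)
        return ((uint32_t)(p[0] & 0x0F) << 12)
             | ((uint32_t)(p[1] & 0x3F) << 6)
             |  (uint32_t)(p[2] & 0x3F);
    return ((uint32_t)(p[0] & 0x07) << 18)
         | ((uint32_t)(p[1] & 0x3F) << 12)
         | ((uint32_t)(p[2] & 0x3F) << 6)
         |  (uint32_t)(p[3] & 0x3F);
}

/* Random access into UTF-8: a scan followed by a decode. */
static uint32_t utf8_at(const unsigned char *s, size_t i)
{
    return utf8_decode(utf8_skip(s, i));
}
```

A regexp engine that backtracks or jumps to arbitrary offsets would pay
the utf8_skip scan (or need extra bookkeeping to avoid it) on each such
access, which is the cost being weighed against the larger UTF-32 buffer.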