On Sat, 19 May 2007, Felipe Monteiro de Carvalho wrote:
> On 5/19/07, Rimgaudas Laucius <[EMAIL PROTECTED]> wrote:
> > It is not useful to have functions for both encodings, because these
> > encodings are interconvertible and it is more efficient to use UTF-16 for
> > data processing.
>
> I disagree. The conversion impacts performance heavily. It will also
> require memory to store the converted string, and after you perform an
> operation you need to convert back.
>
> Further, UTF-16 contains both 2-byte characters and 4-byte characters,
> so I don't see how it would be any faster to process than a UTF-8
> string.

For most operations, it is not necessary to process characters outside
the BMP separately, e.g.:

  for i:=1 to length(s) do
    s[i]:=upcase(s[i]);

... is valid UTF-16 code, and much faster than the same operation in
UTF-8.
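
A slightly fuller sketch of the same point, as a complete program. It
assumes s is a WideString (UTF-16 code units). UpcaseUnit and Utf8SeqLen
are hypothetical helpers written only for this illustration, and
UpcaseUnit covers just ASCII and Latin-1 letters; real code would rely on
the platform's case tables. Surrogate code units fall through unchanged,
which is why the per-index loop is safe, whereas a UTF-8 loop first has to
work out how many bytes each character occupies.

```pascal
program Utf16Upcase;
{$mode objfpc}{$H+}

{ Hypothetical helper: uppercases one UTF-16 code unit.
  Only ASCII and Latin-1 letters are handled here. }
function UpcaseUnit(c: WideChar): WideChar;
begin
  case Ord(c) of
    $61..$7A, $E0..$F6, $F8..$FE:
      Result := WideChar(Ord(c) - $20);
  else
    Result := c;  // includes surrogates $D800..$DFFF, left unchanged
  end;
end;

{ UTF-16: every BMP character is exactly one code unit, so the
  string can be updated in place by index. }
procedure UpcaseUtf16(var s: WideString);
var
  i: Integer;
begin
  for i := 1 to Length(s) do
    s[i] := UpcaseUnit(s[i]);
end;

{ UTF-8: the lead byte determines how many bytes the character
  occupies, so a loop has to inspect it before stepping forward. }
function Utf8SeqLen(lead: Byte): Integer;
begin
  if lead < $80 then Result := 1
  else if (lead and $E0) = $C0 then Result := 2
  else if (lead and $F0) = $E0 then Result := 3
  else if (lead and $F8) = $F0 then Result := 4
  else Result := 1;  // invalid or continuation byte: skip one byte
end;

var
  w: WideString;
begin
  w := 'abc';
  w := w + WideChar($E9);                       // e with acute accent
  w := w + WideChar($D83D) + WideChar($DE00);   // surrogate pair for U+1F600
  UpcaseUtf16(w);
  // The four letters are uppercased; the surrogate pair is untouched.
  WriteLn(Length(w));
  WriteLn(Ord(w[4]) = $C9, ' ', Ord(w[5]) = $D83D, ' ', Ord(w[6]) = $DE00);
  WriteLn(Utf8SeqLen($C3));  // lead byte of the same accented e in UTF-8
end.
```

The UTF-16 loop never needs to know where a character starts. An
equivalent UTF-8 loop would call something like Utf8SeqLen on every lead
byte and could not, in general, rewrite characters in place, since the
uppercase form may have a different byte length.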
Daniël
_______________________________________________
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal