On Fri, Oct 21, 2016 at 3:24 PM, Gabor Boros via Lazarus
<lazarus@lists.lazarus-ide.org> wrote:
> Why is the example below better than a for loop with UTF8Length and UTF8Copy
> for going through the string?

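Because it is MUCH faster. It scales linearly, O(n).
UTF8Copy() has to scan from the start of the string to find the
requested codepoint, so calling it inside the loop makes the whole
thing quadratic, O(n^2). Every additional UTF8...() call in the loop
body, such as UTF8Length(), makes it slower still.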
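
To make the linear approach concrete, here is a minimal sketch: a
pointer walks the string once and advances by the byte size of each
codepoint. CodepointSize and WalkCodepoints are hypothetical names
written for this illustration, a hand-rolled stand-in rather than the
LazUTF8 functions, and the sketch assumes reasonably well-formed UTF-8.

```pascal
program Utf8Walk;
{$mode objfpc}{$H+}

// Hypothetical helper (not from LazUTF8): byte length of the UTF-8
// sequence starting at p, judged from the lead byte only.
function CodepointSize(p: PChar): Integer;
begin
  case Ord(p^) of
    $00..$7F: Result := 1;   // ASCII
    $C0..$DF: Result := 2;   // lead byte 110xxxxx
    $E0..$EF: Result := 3;   // lead byte 1110xxxx
    $F0..$F7: Result := 4;   // lead byte 11110xxx
  else
    Result := 1;             // stray continuation or invalid byte: step over it
  end;
end;

// Visits every codepoint exactly once, so the cost is O(n) in Length(s).
procedure WalkCodepoints(const s: string);
var
  p, pEnd: PChar;
  len: SizeInt;
  cp: string;
begin
  p := PChar(s);
  pEnd := p + Length(s);
  while p < pEnd do
  begin
    len := CodepointSize(p);
    if len > pEnd - p then
      len := pEnd - p;       // truncated sequence at the end of the string
    SetString(cp, p, len);   // copy just the bytes of this codepoint
    WriteLn('cp=', cp, ' (', len, ' bytes)');
    Inc(p, len);             // jump straight to the next codepoint
  end;
end;

begin
  // 'a', U+00E4 and U+20AC, written as raw UTF-8 bytes
  WalkCodepoints('a'#$C3#$A4#$E2#$82#$AC);
end.
```

The pointer never goes back to the start of the string, which is
exactly what an index-based UTF8Copy(s, i, 1) call cannot avoid.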

Yes, we have seen complaints that UTF-8 is unusable because you must
use the slow UTF8Length() and UTF8Copy(), and that UTF-16 is better
because you can use fixed-width S[i] indexing.
That is obviously based on a misunderstanding of both encodings:
UTF-16 is not fixed width either (surrogate pairs), and UTF-8 can be
iterated without those functions.

Hint: if you need to iterate over codepoints, you can also use the
enumerator from the LazUnicode unit. It uses the same concept as the
example on the wiki page. It allows this code:

  for ch in s do
    writeln('ch=',ch);

and the same code even works in Delphi with UTF-16. Cool, ha!?

Juha