theo wrote:
@Luiz Americo
Your code
WideCompareText(UTF8Decode(Key), UTF8Decode(Str))
will work, but if speed matters, then it's rather bad.
Hi, I'm aware that the performance is poor, although I had not measured it
as you did. At this point, though, I'd like to stick with a solution that
FPC provides natively, since it is used in an FPC component
(TSqlite3Dataset).
In the latest revision I switched to the Ansi versions of the functions,
which saves converting the Key at each comparison. See
http://svn.freepascal.org/cgi-bin/viewvc.cgi/trunk/packages/fcl-db/src/sqlite/customsqliteds.pas?view=log#rev13431
Anyway, it is clear that functions for handling UTF-8, and Unicode in
general, are missing in FPC...
I've tried to make a faster function for UTF-8:
... maybe your function can be used as a base for future development.
Perhaps add a new function to the widestringmanager?
Luiz
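uses unicodeinfo, LCLProc;

// Case-insensitive comparison of two UTF-8 strings.
// Returns 0 if equal, <0 if s1 sorts before s2, >0 otherwise.
function UTF8CompareText(s1, s2: UTF8String): Integer;
var
  u1, u2: Ucs4Char;
  u1l, u2l: longint;
  BytePos1, Len1, SLen1: integer;
  BytePos2, Len2, SLen2: integer;
begin
  Result := 0;
  BytePos1 := 1;
  BytePos2 := 1;
  SLen1 := System.Length(s1);
  SLen2 := System.Length(s2);
  // Assumes lower/uppercase representations have the same byte length.
  // This does not hold for every character, e.g. U+212A KELVIN SIGN
  // (3 bytes) lowercases to 'k' (1 byte).
  if SLen1 <> SLen2 then
  begin
    if SLen1 > SLen2 then Result := 1 else Result := -1;
    exit;
  end;
  if SLen1 = 0 then exit; // both strings empty: do not index s1[1]
  repeat
    // Decode one code point from each string and advance by its byte length
    u1 := UTF8CharacterToUnicode(@s1[BytePos1], Len1);
    inc(BytePos1, Len1);
    u2 := UTF8CharacterToUnicode(@s2[BytePos2], Len2);
    inc(BytePos2, Len2);
    // Only do the case mapping when the raw code points differ
    if u1 <> u2 then
    begin
      {$IFDEF useunicodinfo}
      u1l := unicodeinfo.utf8proc_get_property(u1)^.lowercase_mapping;
      if u1l <> -1 then u1 := UCS4Char(u1l);
      u2l := unicodeinfo.utf8proc_get_property(u2)^.lowercase_mapping;
      if u2l <> -1 then u2 := UCS4Char(u2l);
      {$ELSE}
      // Note: WideChar() truncates code points above U+FFFF
      u1 := UCS4Char(WideUpperCase(WideChar(u1))[1]);
      u2 := UCS4Char(WideUpperCase(WideChar(u2))[1]);
      {$ENDIF}
      if u1 <> u2 then
      begin
        // Signed difference, so the result can be negative
        Result := Integer(u1) - Integer(u2);
        exit;
      end;
    end;
  until (BytePos1 > SLen1) or (BytePos2 > SLen2)
end;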
Some numbers for my system (Linux), where WideCompareText is the function
you use now, WideUpperCase is the above function without useunicodinfo
defined, and unicodeinfo is the above function with useunicodinfo defined.
See here
http://wiki.lazarus.freepascal.org/Theodp
Comparing identical Strings of 322 Chars 10000 times
WideCompareText: 785ms
unicodeinfo: 75ms
WideUpperCase: 74ms
Comparing Strings of 322 Chars 10000 times where the 3rd char differs
WideCompareText: 268ms
unicodeinfo: 3ms
WideUpperCase: 8ms
Comparing identical Text of 322 Chars 10000 times where one Text is all
uppercase
WideCompareText: 810ms
unicodeinfo: 121ms
WideUpperCase: 1076ms
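
For anyone who wants to repeat this kind of measurement, here is a minimal sketch of a timing loop. It is not the harness used for the numbers above: the test strings are hypothetical placeholders (the 322-character strings are not included in this mail), and timing via Now/MilliSecondsBetween is just one simple choice. Only the WideCompareText variant is shown; the other variants would be timed the same way by swapping the call inside the loop.

```pascal
program BenchCompare;
{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}cwstring,{$ENDIF} // installs a widestring manager on Unix
  SysUtils, DateUtils;

const
  Iterations = 10000;

var
  Key, Str: UTF8String;
  i, r: Integer;
  StartTime: TDateTime;
begin
  // Hypothetical test data, not the strings used for the figures above
  Key := 'some sample text';
  Str := 'SOME SAMPLE TEXT';
  r := 0;
  StartTime := Now;
  for i := 1 to Iterations do
    r := WideCompareText(UTF8Decode(Key), UTF8Decode(Str));
  // Print the last result too, so the comparison is visibly used
  WriteLn('WideCompareText: ', MilliSecondsBetween(Now, StartTime),
          'ms (result ', r, ')');
end.
```

Wall-clock timing like this is coarse, so short runs are best repeated a few times before comparing the figures.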
Regards Theo
_______________________________________________
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal