I have never really used UnicodeString (or WideString, for that matter); I've always used AnsiString with UTF-8 content. I also have my own UTF-8 versions of Copy(), Length(), etc.
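A minimal sketch of what such a UTF-8 aware Length() might look like. UTF8Len is a hypothetical name chosen for illustration, not the actual function referred to above, and the sketch assumes the input is well-formed UTF-8. The test string is spelled with explicit byte values so that the encoding of the source file does not matter.

```pascal
program utf8len_sketch;
{$mode objfpc}{$h+}

// Hypothetical helper (assumption, not the poster's real code):
// counts UTF-8 code points by skipping continuation bytes,
// i.e. bytes whose top two bits are 10.
function UTF8Len(const S: AnsiString): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to Length(S) do
    if (Ord(S[i]) and $C0) <> $80 then
      Inc(Result);
end;

var
  S: AnsiString;
begin
  // 'Tiburón' as raw UTF-8 bytes: the o with acute is $C3 $B3
  S := 'Tibur'#$C3#$B3'n';
  WriteLn(Length(S));   // length in bytes
  WriteLn(UTF8Len(S));  // length in code points
end.
```

If the compiler stores the literal byte for byte, the first value should be the byte count (8) and the second the code point count (7).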
Looking at UnicodeString with FPC 2.6.4, I am a bit confused. :-/ Take the following code:

============================
{$mode objfpc}{$h+}
{--- $codepage utf8}  // disabled

var
  S: UTF8String;  // for FPC 2.6.4 this is an alias for AnsiString
  U: UnicodeString;
begin
  S := 'Tiburón';
  WriteLn(Length(S));
  U := 'Tiburón';
  WriteLn(Length(U));
end.
============================

On my 64-bit FreeBSD system that outputs the following:

==========
10
8
==========

Length() returns the number of bytes, correct? So why isn't the result 8 and 14? The letter o with acute is 2 bytes in UTF-8 ($C3 $B3). UnicodeString is UTF-16, where a "character" is word-sized (2 bytes), so 2 bytes * 7 characters = 14 bytes. But Length() returns totally different values from what I expected.

Enabling {$codepage utf8} made no difference to the results shown above.

Could anybody explain this please?

Regards,
  - Graeme -

-- 
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

My public PGP key: http://tinyurl.com/graeme-pgp