Jonas Maebe wrote:
On 10 Nov 2008, at 17:00, Vincent Snijders wrote:
procedure TForm1.Button1Click(Sender: TObject);
var
w: widestring;
i: integer;
begin
w := UTF8Decode('hallo äöü');
Edit1.Caption := UTF8Encode(w);
Note that if the file has been saved using a UTF-8 BOM, then the
compiler will at compile time create a widestring containing the UTF-16
version of 'hallo äöü'. If you then pass this to a function expecting an
ansistring (such as UTF8Decode), then the widestring manager will be
used to convert that widestring to an ansistring, and this converted
string will be passed to UTF8Decode. So then you'll pass an ansi-encoded
string to UTF8Decode rather than a UTF-8-encoded string (unless
ansi = utf-8 for the current execution).
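
A minimal sketch of the contract described above (not from the original
thread): UTF8Decode interprets its argument as UTF-8 bytes. To stay
independent of the source file encoding, the sketch spells the bytes out
with #-escapes. The program and variable names are made up for
illustration.

```pascal
program utf8decode_demo;
{$mode objfpc}{$H+}
var
  utf8: AnsiString;
  w: WideString;
begin
  // 'hallo äöü' written as explicit UTF-8 bytes, so the result does not
  // depend on how this source file was saved.
  utf8 := 'hallo '#$C3#$A4#$C3#$B6#$C3#$BC;
  // UTF8Decode expects exactly this: UTF-8 bytes, not ansi-encoded text.
  w := UTF8Decode(utf8);
  WriteLn(Length(utf8)); // 12 (bytes: 6 ASCII + 3 * 2)
  WriteLn(Length(w));    // 9  (UTF-16 code units)
  WriteLn(Ord(w[7]));    // 228 (U+00E4, 'ä')
end.
```

If the argument had first been converted to the ansi code page, UTF8Decode
would be handed bytes that are only valid UTF-8 when ansi = utf-8.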
Yes, that might be confusing. Therefore I don't recommend saving it with a
UTF-8 BOM or compiling with -Fcutf-8.
It seems much more advisable to me to save the file with a UTF-8 BOM,
or even better to add {$encoding utf-8} (and/or to pass -Fcutf-8 to the
compiler) and then just use
Edit1.Caption := UTF8Encode('hallo äöü');
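
A sketch of this approach as a console program, under two assumptions that
are not from the original mail: the file is saved as UTF-8, and the
directive is spelled {$codepage utf8}, which is the form I know Free
Pascal accepts. The program name and the variables are hypothetical.

```pascal
program codepage_demo;
{$mode objfpc}{$H+}
{$codepage utf8} // assumption: this file is saved as UTF-8
var
  w: WideString;
  s: AnsiString;
begin
  // With the source code page declared, the compiler can turn the
  // literal into a UTF-16 constant at compile time.
  w := 'hallo äöü';
  // One explicit conversion back to UTF-8 at run time.
  s := UTF8Encode(w);
  WriteLn(Length(w)); // 9 UTF-16 code units
  WriteLn(Length(s)); // 12 UTF-8 bytes
end.
```

Only lengths are printed so that the result does not depend on the console
encoding.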
As an extra bonus of not adding the UTF-8 BOM, you don't need any conversion to
assign the UTF-8 string in the source to a UTF-8 encoded ansistring. With the BOM,
the compiler translates the string to UTF-16 first, so it has to be converted back.
Leaving the BOM out saves a conversion at compile time and a conversion at run time:
Edit1.Caption := 'hallo äöü';
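
A sketch of the no-BOM case, assuming the file is saved as UTF-8 without a
BOM, with no code page directive and no -Fc option. Under those
assumptions the compiler should copy the literal's bytes into the
ansistring unchanged. The program name and the variables are hypothetical.

```pascal
program nobom_demo;
{$mode objfpc}{$H+}
var
  s, expected: AnsiString;
begin
  // No BOM and no code page directive: the literal is taken as raw bytes.
  s := 'hallo äöü';
  // The same text spelled out as explicit UTF-8 bytes for comparison.
  expected := 'hallo '#$C3#$A4#$C3#$B6#$C3#$BC;
  WriteLn(Length(s));   // should be 12 if the bytes were copied verbatim
  WriteLn(s = expected);
end.
```

If the comparison fails, the compiler converted the literal, which is what
the BOM or -Fcutf-8 would cause.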
Is there an explicit way to tell the compiler not to convert string constants to
widestrings, even if the file contains a UTF-8 BOM? The UTF-8 BOM might be useful
if you want to edit the file with another text editor.
Vincent
_______________________________________________
fpc-devel maillist - [email protected]
http://lists.freepascal.org/mailman/listinfo/fpc-devel