Jonas Maebe wrote:

On 10 Nov 2008, at 17:00, Vincent Snijders wrote:

procedure TForm1.Button1Click(Sender: TObject);
var
 w: widestring;
 i: integer;
begin
 w := UTF8Decode('hallo äöü');
 Edit1.Caption := UTF8Encode(w);

Note that if the file has been saved with a UTF-8 BOM, the compiler will create, at compile time, a widestring containing the UTF-16 version of 'hallo äöü'. If you then pass this to a function expecting an ansistring (such as UTF8Decode), the widestring manager will be used to convert that string to the ansi encoding, and this converted string will be passed to UTF8Decode. So you'll then pass an ansi-encoded string to UTF8Decode rather than a UTF-8-encoded string (unless ansi = utf-8 for the current execution).


Yes, that might be confusing. Therefore I don't recommend saving it with a UTF-8 BOM or compiling with -Fcutf-8.

It seems much more advisable to me to save the file with a UTF-8 BOM, or even better to add {$encoding utf-8} (and/or to pass -Fcutf-8 to the compiler), and then just use

Edit1.Caption := UTF8Encode('hallo äöü');

As an extra bonus of not adding the UTF-8 BOM, you don't need any conversion to assign the UTF-8 string in the source to a UTF-8-encoded ansistring. With the BOM, the compiler translates the constant to a UTF-16 string, which then has to be converted back to UTF-8. Leaving the BOM out saves one conversion at compile time and one at run time:

Edit1.Caption := 'hallo äöü';

Is there an explicit way to tell the compiler not to convert widestring string constants, even if the file contains a UTF-8 BOM? The UTF-8 BOM might be useful if you want to edit the file with another text editor.
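
For reference, a minimal sketch of the conversion chain discussed above. It assumes the source file is saved as UTF-8 and uses the {$codepage utf8} directive, which should have the same effect on string constants as a UTF-8 BOM or -Fcutf-8. The program name and variable names are made up for illustration, and the exact behaviour may differ between compiler versions.

```pascal
program BomDemo;  // hypothetical name, for illustration only
{$mode objfpc}{$H+}
{$codepage utf8}  // assumption: treated like a UTF-8 BOM for string constants

var
  w: WideString;
  s: UTF8String;
begin
  // With the directive active, the compiler stores this constant as UTF-16.
  w := 'hallo äöü';

  // Explicit UTF-16 -> UTF-8 conversion at run time. Because the input is
  // already a widestring, no ansi conversion by the widestring manager
  // happens on the way in.
  s := UTF8Encode(w);

  WriteLn(Length(w));  // 9 UTF-16 code units
  WriteLn(Length(s));  // 12 bytes: 6 ASCII bytes plus 3 two-byte characters
end.
```

Removing the {$codepage utf8} line (and the BOM) and assigning the literal directly to s would instead copy the 12 source bytes unchanged, which is the variant without conversions described above.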

Vincent
_______________________________________________
fpc-devel maillist  -  [email protected]
http://lists.freepascal.org/mailman/listinfo/fpc-devel
