[fpc-pascal] converting to UTF8

2022-03-23 Thread cibersvaa--- via fpc-pascal


Lazarus 2.0.12
FPC: 3.2.0
SVN: 64642
OS: Windows 10 Pro/win64

I'm reading from a file encoded in Windows-1252 (win1252) and want to
convert its contents to UTF-8, but I can't get it to work.


procedure TestString;
var
  Original:string;
  Converted:string;
begin
  original:='ESPA'#209'A'; //ESPAÑA WIN1252
  Converted:=ansiToUtf8(original);  // converts to 'ESPA'#239#191#189'A'
 // converted Should be 'ESPA'#195#145'A'
end;

I've tried playing with the string types (string, rawstring, ansistring,
utf8string). No luck. Any hints?



Saludos
Santiago A.



___
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal


Re: [fpc-pascal] converting to UTF8

2022-03-23 Thread LacaK via fpc-pascal



> procedure TestString;
> var
>   Original:string;
>   Converted:string;
> begin
>   original:='ESPA'#209'A'; //ESPAÑA WIN1252
>   Converted:=ansiToUtf8(original);  // converts to 'ESPA'#239#191#189'A'
>  // converted Should be 'ESPA'#195#145'A'
> end;
>
> I've tried playing with strings types, string, rawstring,ansistring,
> utf8string. No way. Any hint?




To explicitly convert between run-time code pages you can use the
procedure SetCodePage(var s: RawByteString; CodePage: TSystemCodePage;
Convert: Boolean = True):


SetCodePage(original, 1252, False);
SetCodePage(original, CP_UTF8, True);
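
The two calls above can be sketched as a small stand-alone program. This is
only a sketch under some assumptions: the helper name Win1252ToUtf8 is
hypothetical, the input is held in a RawByteString so that it can be passed
to SetCodePage directly, and on Unix-like systems the cwstring unit is
pulled in because code page conversion there needs a widestring manager.
The first call only relabels the bytes as CP1252 (Convert = False); the
second call does the actual transcoding.

```pascal
program Cp1252Demo;
{$mode objfpc}{$H+}

{$ifdef unix}
uses
  cwstring; // widestring manager, needed for code page conversion on Unix-like systems
{$endif}

// Hypothetical helper: treat Bytes as Windows-1252 and return them as UTF-8.
function Win1252ToUtf8(const Bytes: RawByteString): RawByteString;
begin
  Result := Bytes;
  // Step 1: only tag the bytes as CP1252, do not touch the data.
  SetCodePage(Result, 1252, False);
  // Step 2: transcode the data from CP1252 to UTF-8.
  SetCodePage(Result, CP_UTF8, True);
end;

var
  Utf8: RawByteString;
  I: Integer;
begin
  Utf8 := Win1252ToUtf8('ESPA'#209'A');
  // Print the byte values; the question expects #195#145 in place of #209.
  for I := 1 to Length(Utf8) do
    Write(Ord(Utf8[I]), ' ');
  WriteLn;
end.
```

Comparing the byte values rather than the strings avoids any implicit code
page conversion during the comparison.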

-Laco.



Re: [fpc-pascal] converting to UTF8

2022-03-23 Thread Mattias Gaertner via fpc-pascal
On Tue, 22 Mar 2022 04:47:50 -0400
cibersvaa--- via fpc-pascal wrote:

> Lazarus 2.0.12
> FPC: 3.2.0
> SVN: 64642
> OS: Windows 10 Pro/win64
> 
> I'm reading from a file with character set win1252, I want to convert
> it to utf8, but I can't.
> 
> procedure TestString;
> var
>   Original:string;
>   Converted:string;
> begin
>   original:='ESPA'#209'A'; //ESPAÑA WIN1252

FPC does not yet understand comments, so maybe it does not know this
literal is cp1252. Add {$codepage cp1252} somewhere at the start of
the unit.

If this is part of a Lazarus application, then String is by
default UTF-8, so your "original" is already converted to UTF-8.

>   Converted:=ansiToUtf8(original);  // converts to 'ESPA'#239#191#189'A'
>  // converted Should be 'ESPA'#195#145'A'

If you want to load a string encoded in CP1252, then you can use
CP1252ToUTF8 from the unit LConvEncoding (part of LazUtils):

s:=CP1252ToUTF8(StringFromFile);
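
A slightly fuller sketch of that call, assuming the file is read as raw
bytes first. The helper name LoadCp1252AsUtf8 is hypothetical and the file
name is taken from the command line; the program needs the LazUtils package
on the unit search path for LConvEncoding.

```pascal
program LoadCp1252;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, LConvEncoding;

// Hypothetical helper: read the whole file as bytes, then convert
// those bytes from CP1252 to UTF-8.
function LoadCp1252AsUtf8(const FileName: string): string;
var
  Stream: TFileStream;
  Raw: string;
begin
  Raw := '';
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    SetLength(Raw, Stream.Size);
    if Length(Raw) > 0 then
      Stream.ReadBuffer(Raw[1], Length(Raw));
  finally
    Stream.Free;
  end;
  Result := CP1252ToUTF8(Raw);
end;

begin
  if ParamCount < 1 then
  begin
    WriteLn('usage: LoadCp1252 <file>');
    Halt(1);
  end;
  // Console output may still look wrong if the console itself is not UTF-8.
  WriteLn(LoadCp1252AsUtf8(ParamStr(1)));
end.
```

Reading the bytes untouched and converting once avoids any implicit
conversion on the way in.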


> end;
> 
> I've tried playing with strings types, string, rawstring,ansistring,  
> utf8string. No way. Any hint?


Mattias