On 2020-09-12 18:51, Jonas Maebe via fpc-pascal wrote:
On 12/09/2020 18:44, Sven Barth via fpc-pascal wrote:
Jonas Maebe via fpc-pascal <fpc-pascal@lists.freepascal.org> wrote on
Sat, 12 Sep 2020, 17:47:

    > All the doubts, questions, and discussions prove that the current
    > system is counter-intuitive and confusing.

    The issue in this thread is caused by a bug in the LCL: it blindly
    assumes that the dynamic code page of the caption string is always
    UTF-8. That is simply wrong (unless you put the burden on the user to
    always assign a UTF-8-encoded string to it, but _that_ is
    counter-intuitive and confusing).


But shouldn't the compiler insert a conversion if the string is declared
as CP_1250 and the destination is CP_ACP? 

There are two things:
1) regardless of how the static code page of a string is declared, it is
never guaranteed that its dynamic code page will match it. The simplest
example is when you assign a RawByteString to it. There is, however,
also a second case (and this one indeed is counter-intuitive, but needed
for backward compatibility):
2) the second bullet under
https://wiki.freepascal.org/FPC_Unicode_support#Dynamic_code_page

That's what gets triggered here: the source file CP is CP_1250 and the
string is also ansistring(1250). That case would be solved by declaring
Label as UTF8String though.

Yep.
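
To make point 1 above concrete, here is a minimal sketch (not taken from the thread) that prints the dynamic code page of a string declared with static code page 1250 after a RawByteString has been assigned to it. The program name, the type name TCP1250String and the variable names are made up for illustration; StringCodePage and SetCodePage are the System unit routines. No expected output is given, because the printed value may depend on the compiler version.

```pascal
program dyncp;
{$mode objfpc}{$H+}

type
  { Hypothetical name: an ansistring whose static (declared) code page is 1250 }
  TCP1250String = type AnsiString(1250);

var
  Raw: RawByteString;
  S: TCP1250String;
begin
  Raw := 'abc';
  { Relabel the data as UTF-8 without converting the bytes (Convert = False) }
  SetCodePage(Raw, CP_UTF8, False);
  { Point 1 above: after assigning a RawByteString, the dynamic code page
    of S is not guaranteed to match its static code page }
  S := Raw;
  WriteLn('static code page of S : 1250');
  WriteLn('dynamic code page of S: ', StringCodePage(S));
end.
```

If the second line prints something other than 1250, any code that trusts the declared code page instead of calling StringCodePage would misinterpret the bytes, which is the kind of assumption described for the LCL above.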

While performing some tests, I came across other things which are not very nice either (those are specific to the Win32/Win64 targets, due to the difference between the process codepage and the console codepage). Let's take the following test program:

{$codepage cp1250}
{$IFDEF USECRT}
uses
 Crt;
{$ENDIF USECRT}
const
 S = 'žluťoučký kůň'; { untyped string constant }
var
 T: string; { shortstring with -Mfpc, ansistring with -Mdelphi }
begin
 T := S;
{$IFDEF USECRT}
 Write ('Using Crt');
{$ELSE USECRT}
 Write ('Not using Crt');
{$ENDIF USECRT}
 WriteLn (S);
 WriteLn (T);
 WriteLn (DefaultSystemCodepage); { process codepage }
 WriteLn (TextRec (Output).Codepage); { codepage of the Output handle }
end.

Let's compile it _without_ -dUSECRT and _with_ -Mfpc first. The original poster uses the same default codepage as I do. If I start cmd.exe and run "chcp" without parameters, it shows codepage 852 as the console codepage. Now run the test program. It shows that the codepage for the default file handle Output matches the console codepage (as it should), but the string output is incorrect for both the WriteLn(S) and the WriteLn(T) line. If you run "chcp 1250" and then run the program again, the codepages match and the string output is correct.

If you compile the same program with -dUSECRT, the output is correct for both WriteLn calls regardless of the console codepage setting (i.e. both for "chcp 852" and for "chcp 1250" - and also for "chcp 65001").

If you compile the same program with -dUSECRT and -Mdelphi together and run it in a console window set to codepage 852 (i.e. the default setting here), the first WriteLn call gives a wrong result, whereas the second gives a correct one. The reason is that T becomes an ansistring in mode Delphi, so dynamic translation is performed, as opposed to the case when a shortstring or an untyped constant is passed.
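
Based on that explanation, a variant of the test that declares T as AnsiString explicitly, so that it no longer depends on the -Mfpc/-Mdelphi setting, could look like the sketch below. This is an assumption derived from the behaviour described above, not something tested in this thread; the program name is made up, and the file has to be saved in cp1250 to match the {$codepage} directive. It also prints the dynamic code page of T next to the codepage of Output, so that any mismatch is visible.

```pascal
{$codepage cp1250}
program cptest;
var
  { Declared as AnsiString explicitly, independent of -Mfpc/-Mdelphi }
  T: AnsiString;
begin
  T := 'žluťoučký kůň';
  { An ansistring carries its code page, so WriteLn can translate it }
  WriteLn(T);
  WriteLn('dynamic code page of T: ', StringCodePage(T));
  WriteLn('code page of Output   : ', TextRec(Output).CodePage);
end.
```

Whether the first line comes out correctly under "chcp 852" without Crt would still need to be checked on a real Win32/Win64 console.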

Tomas