On 2020-09-12 18:51, Jonas Maebe via fpc-pascal wrote:
On 12/09/2020 18:44, Sven Barth via fpc-pascal wrote:
Jonas Maebe via fpc-pascal <fpc-pascal@lists.freepascal.org> wrote on
Sat, 12 Sep 2020, 17:47:
> All the doubts, questions, and discussions prove that current system is
> counter-intuitive and confusing.
The issue in this thread is caused by a bug in the LCL: it blindly
assumes that the dynamic code page of the caption string is always
UTF-8. That is simply wrong (unless you put the burden on the user to
always assign a UTF-8-encoded string to it, but _that_ is
counter-intuitive and confusing).
But shouldn't the compiler insert a conversion if the string is declared
as CP_1250 and the destination is CP_ACP?
There are two things:
1) regardless of how the static code page of a string is declared, it is
never guaranteed that its dynamic code page will match it. The simplest
example is when you assign a RawByteString to it. There is, however,
also a second case (and this one indeed is counter-intuitive, but needed
for backward compatibility):
2) the second bullet under
https://wiki.freepascal.org/FPC_Unicode_support#Dynamic_code_page
That's what gets triggered here: the source file CP is CP_1250 and the
string is also ansistring(1250). That case would be solved by declaring
Label as UTF8String though.
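
To illustrate case 1 above, here is a minimal sketch (not taken from the
LCL or from this thread) of how the dynamic code page can differ from the
declared one, and how a consumer could check it instead of assuming UTF-8.
The type name TCp1250String and the program name are hypothetical;
StringCodePage and SetCodePage are the System unit routines. The values
printed depend on the compiler version and platform, so none are given
here.

```pascal
program DynCpSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical alias for a string whose static code page is 1250.
  TCp1250String = type AnsiString(1250);

var
  Raw: RawByteString;
  S: TCp1250String;
begin
  Raw := 'abc';
  // Relabel the bytes as UTF-8 without converting them.
  SetCodePage(Raw, CP_UTF8, False);

  // Case 1: assigning a RawByteString is described above as performing
  // no conversion, so S is expected to keep the dynamic code page of Raw.
  S := Raw;
  WriteLn('dynamic code page of S: ', StringCodePage(S));

  // A consumer that needs UTF-8 can check instead of assuming.
  if StringCodePage(S) <> CP_UTF8 then
  begin
    Raw := S;
    SetCodePage(Raw, CP_UTF8, True);  // convert the data to UTF-8
    WriteLn('converted copy has code page: ', StringCodePage(Raw));
  end;
end.
```

The check-then-convert step is only one possible guard. Declaring the
destination as UTF8String, as suggested above, lets the compiler insert
the conversion in the ordinary case.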
Yep.
While performing some tests, I came across other things which are not
very nice either (those are specific to the Win32/Win64 target due to
the difference between process codepage and console codepage). Let's
take the following test program:
{$codepage cp1250}

{$IFDEF USECRT}
uses
  Crt;
{$ENDIF USECRT}

const
  S = 'žluťoučký kůň';

var
  T: string;

begin
  T := S;
{$IFDEF USECRT}
  Write ('Using Crt');
{$ELSE USECRT}
  Write ('Not using Crt');
{$ENDIF USECRT}
  WriteLn (S);
  WriteLn (T);
  WriteLn (DefaultSystemCodepage);
  WriteLn (TextRec (Output).Codepage);
end.
Let's compile it _without_ -dUSECRT and _with_ -Mfpc first. The original
poster uses the same default codepage as me. If I start cmd.exe and run
"chcp" without parameters, it shows codepage 852 as the console
codepage. Now run the test program. It shows that the codepage for the
default file handle Output matches the console codepage (as it should),
but the string output is incorrect for both WriteLn(S) and WriteLn(T)
lines. If you perform "chcp 1250" and run the program again, the
codepages match and the string output is correct.
If you compile the same program with -dUSECRT, the output is correct for
both WriteLn calls regardless of the console codepage setting (i.e.
both for "chcp 852" and for "chcp 1250" - and also for "chcp 65001").
If you compile the same program with -dUSECRT and -Mdelphi together and
run the program in a console window set to codepage 852 (i.e. the
default setting here), the first WriteLn call is wrong, whereas the
second gives a correct result (due to the fact that T becomes an
ansistring in mode Delphi and dynamic translation is thus performed as
opposed to the case when a shortstring or an untyped constant is
passed).
Tomas