On Wed, 11 May 2016, Graeme Geldenhuys wrote:

Hi,

Here is an example [proof if you will] of the problem. I wrote a small
test program that reads data from a Firebird database where the database
and field charset is set to UTF8.

I compile the program once, then run it twice with no recompile in between.
For the first run my system is set to a UTF-8 locale. For the second run
I set my system to an ISO8859-1 (Latin-1) locale. The program
outputs the DefaultSystemCodePage to the console.

Because the locale changes the behaviour of String (aka AnsiString) in
the RTL and FCL, the first run works, but the second run corrupts my data.

Console output:

[unicode_test]$ export LANG=en_US.UTF-8
[unicode_test]$ ./unicodetest
65001

[unicode_test]$ export LANG=en_US.ISO8859-1
[unicode_test]$ ./unicodetest
28591


In my test program I write the data read from the database to a file
using TFileStream, thus console and file encoding settings will not
affect the data being written to file. TFileStream is simply writing bytes.
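
A minimal sketch of that kind of byte-level write, for illustration only.
The procedure name SaveRaw, the file name out.bin and the sample bytes are
assumptions, not taken from the original test program:

```pascal
program write_raw_bytes;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Writes the bytes of S to FileName exactly as they sit in memory.
// A RawByteString parameter accepts any AnsiString flavour without
// triggering a code page conversion on the call.
procedure SaveRaw(const FileName: string; const S: RawByteString);
var
  fs: TFileStream;
begin
  fs := TFileStream.Create(FileName, fmCreate);
  try
    if Length(S) > 0 then
      fs.WriteBuffer(S[1], Length(S));
  finally
    fs.Free;
  end;
end;

var
  data: RawByteString;
begin
  // Hypothetical sample data: the two UTF-8 bytes of U+00E9.
  SetLength(data, 2);
  data[1] := #$C3;
  data[2] := #$A9;
  SaveRaw('out.bin', data);
end.
```

Whatever bytes the string holds at the time of the call end up in the file,
so any corruption seen there must have happened before the write.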

But what does your program prove?

You're only proving that a conversion happens when you do
  s := fieldByName('somefield').asString;
and that the conversion takes the locale into account, which in one of the
two runs differs from the actual charset of the data in the database.
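
The effect can be reproduced without a database. The sketch below is not
what TField.AsString does internally; it only assumes that UTF-8 data ends
up being assigned to a plain AnsiString, with a UTF8String variable (the
hypothetical FromDb) standing in for the field data. It needs FPC 3.0 or
later:

```pascal
program asstring_conversion_sketch;

{$mode objfpc}{$H+}

uses
  {$ifdef unix}
  cwstring,  // widestring manager; needed on Unix for code page conversions
  {$endif}
  SysUtils;

// Hex dump of the bytes in S. The RawByteString parameter accepts any
// AnsiString flavour without converting it.
function HexBytes(const S: RawByteString): string;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to Length(S) do
    Result := Result + IntToHex(Ord(S[i]), 2) + ' ';
end;

var
  FromDb: UTF8String;  // hypothetical stand-in for the UTF8 field data
  S: AnsiString;       // plain String/AnsiString, as in the thread
begin
  // U+00E9 encoded as UTF-8 is the byte pair C3 A9. The bytes are set one
  // by one so that no string-constant conversion is involved.
  SetLength(FromDb, 2);
  FromDb[1] := #$C3;
  FromDb[2] := #$A9;

  // Assigning to a plain AnsiString converts to DefaultSystemCodePage
  // whenever that differs from the code page of the source string.
  S := FromDb;

  WriteLn('DefaultSystemCodePage: ', DefaultSystemCodePage);
  WriteLn('source bytes: ', HexBytes(FromDb));
  WriteLn('after assign: ', HexBytes(S));
end.
```

Run under a UTF-8 locale, the two dumps should be identical. Run under an
ISO8859-1 locale, the second dump is expected to differ, because the
assignment converts the data to the system code page.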

This conversion is as designed. It is known to be wrong in the case of
TField.AsString, but it will not be solved by simply using
{$modeswitch unicodestring} in the database code.

AFAIK 3.0 is no different from 2.6.4 in this matter; Jonas can confirm or
deny. Unlike 2.6.4, however, 3.0.0 gives us the possibility to fix it by
allowing the codepage to be specified in TField. This is not yet implemented.

Michael.