On Thu, 31 Mar 2016 00:16:13 +0200 "Michael W. Vogel" <m-w-vo...@gmx.de> wrote:
> [...]
> I've tested the example too and I got different results with different
> options. The test was:
> - BOM / no BOM at the beginning of the sourcefile
> - {$codepage UTF8} or not

The compiler understands -FcUTF8, {$codepage utf8} and a BOM. All three set
the source codepage to UTF-8. See here:
http://wiki.freepascal.org/FPC_Unicode_support#Source_file_codepage

A BOM has the advantage that other text editors understand it as well, and
the disadvantage that it is hidden, so that people unaware of encodings are
easily confused. -FcUTF8 has the advantage that it applies to all sources in
the project/package and can easily be turned off; you can unset it for a
single unit via {$modeswitch systemcodepage}. (A rough sketch of these
directives follows at the end of this mail.)

> - fpc -MObjFPC *-Sh* test.pas (with / without -Sh (use reference counted
> strings))

And this is where the confusion starts. Mixing multiple string types is
asking for trouble. FPC has an impressive (aka frightening) list of string
types and consequently a vast net of combinations that only graph theorists
can appreciate.

> So it is realy more complex as I thought...

Yes. And you have not yet explored the difficulties in code supporting both
FPC 2.6.4 and 3+, and LCL 1.4 and 1.6.

Although Lazarus recommends "simply" using UTF-8, technically it recommends
AnsiString, DefaultSystemCodePage set to CP_UTF8, no explicit codepage, and
the UTF-8 functions in LazUtils. If you need to use other string types in a
unit, you might want to add an explicit codepage. Maybe a paragraph should
be added to the wiki about using non-AnsiString types with the "Lazarus
UTF-8". (A second sketch below shows the LazUtils approach.)

Mattias
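
A rough sketch of the source codepage directives and of what happens when
string types with different declared codepages meet. The program and type
names are made up; it assumes FPC 3.x, and on Unix the codepage conversion
may additionally need the cwstring unit in the uses clause:

  program CodepageSketch;
  {$codepage UTF8}       // same effect as -FcUTF8 or a UTF-8 BOM
  {$mode objfpc}{$H+}    // -MObjFPC -Sh on the command line
  // {$modeswitch systemcodepage} would undo -FcUTF8 for this file only.
  // On Unix, add the cwstring unit to the uses clause for the conversion.

  type
    // AnsiString with an explicit declared codepage (FPC 3+ syntax):
    TLatin1String = type AnsiString(28591);   // ISO-8859-1

  var
    U: UTF8String;      // declared codepage CP_UTF8 (65001)
    L: TLatin1String;   // declared codepage 28591
  begin
    U := 'äöü';                   // literal is read as UTF-8 due to the directive
    L := U;                       // implicit UTF-8 -> Latin-1 conversion
    WriteLn(StringCodePage(U));   // 65001
    WriteLn(StringCodePage(L));   // 28591
    WriteLn(Length(U), ' ', Length(L));  // 6 and 3: Length counts bytes, not characters
  end.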
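
And a similarly rough sketch of the "Lazarus UTF-8" way: a plain AnsiString
carrying UTF-8 bytes, no explicit codepage, and the LazUtils routines doing
the codepoint work. The program name is made up; it assumes the file is
saved as UTF-8 without BOM and that the LazUtils package is available (as
far as I understand, with FPC 3+ the LazUTF8 unit also takes care of setting
DefaultSystemCodePage to CP_UTF8):

  program LazUtf8Sketch;
  {$mode objfpc}{$H+}
  // No BOM, no {$codepage}: the UTF-8 bytes of the literal pass through
  // unchanged into a plain AnsiString; LazUTF8 interprets them as UTF-8.
  uses
    LazUTF8;   // from the LazUtils package

  var
    S: String;   // plain AnsiString holding UTF-8 encoded text
  begin
    S := 'äöü';
    WriteLn(Length(S));          // 6 - bytes
    WriteLn(UTF8Length(S));      // 3 - codepoints
    WriteLn(UTF8Copy(S, 2, 1));  // 'ö' (second codepoint) - codepoint-aware Copy
  end.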