Hi,

>
> >> If your source code is in UTF8 but you do not tell this to the compiler,
> ...
> > tried that some of the time. I have never had any problems other than the
> > Windows Console issue.
>
> As mentioned before, that causes the compiler to directly pass your UTF-8
> data around without any conversion. That is how Lazarus works (it stores UTF8
> data in plain ansistrings) and as long as you only use LCL routines it will
> work fine, but using such code directly with the OS API (via the FPC RTL or
> not) will obviously cause problems if that API does not expect data in
> UTF-8 format.

I pretty much use only UTF8 in most programs, and make exclusive use of UTF8String. (I only use ANSIString if the string is known to be in a local encoding, such as having been read from an SJIS file.) But I was under the impression that UTF8String is just an alias for ANSIString for now anyway. I convert if necessary when using the OS APIs. (The main problem is knowing when it is necessary...)

> >>> Detecting the Unicode BOM or not seems to be a strange way to switch the
> >>> behavior of the compiler,
> >>>
> > to make the encoding of the file clear, but not so much as a mode switch
> > (which is what the Wiki makes it sound like).
>
> It causes the compiler to interpret the string constants in your program
> as UTF-8 rather than as unknown binary data, and hence convert them at run
> time to the current ansi code page when assigning them to an
> ansistring/shortstring. This is unrelated to mode switches.

Ooh, this clears up a lot that the Wiki didn't explain very well! So basically, saving the file with a BOM tells the compiler/RTL to take care of things for you, and saving as UTF8 without a BOM is appropriate if you will take care of any conversions yourself. If UTF8String is just an alias for ANSIString, then I assume that right now the compiler would also convert such constants to the local encoding even when assigning to a UTF8String? (If so, this explains why my finally working console code works only with no BOM.)
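
As a rough illustration of the two cases described above, here is a sketch in Python that only models the byte-level effect. The helper names and the choice of cp932 as the "current ansi code page" are assumptions made for illustration; this is not FPC's actual implementation.

```python
import codecs

UTF8_BOM = codecs.BOM_UTF8  # the three bytes EF BB BF


def has_utf8_bom(source_bytes: bytes) -> bool:
    """Return True if the source file starts with the UTF-8 BOM."""
    return source_bytes.startswith(UTF8_BOM)


def constant_as_stored(literal_bytes: bytes, source_is_utf8: bool,
                       ansi_codepage: str) -> bytes:
    """Model the bytes that end up in a byte string after assigning a literal.

    Hypothetical helper, not an FPC routine.
    - source_is_utf8=False: the literal is opaque binary data, copied as-is.
    - source_is_utf8=True: the literal is decoded as UTF-8 and re-encoded
      into the given ANSI code page.
    """
    if not source_is_utf8:
        return literal_bytes
    return literal_bytes.decode("utf-8").encode(ansi_codepage)


text = "日本語"
literal = text.encode("utf-8")  # what a UTF-8 source file contains

# Case 1: no BOM, so the UTF-8 bytes are passed through untouched.
raw = constant_as_stored(literal, source_is_utf8=False, ansi_codepage="cp932")

# Case 2: BOM present, so the bytes are converted to the local code page.
converted = constant_as_stored(literal, source_is_utf8=True, ansi_codepage="cp932")

# An API expecting cp932 reads case 2 correctly, but misreads case 1.
misread = raw.decode("cp932", errors="replace")
```

In this model, `raw` is only safe to hand to something that itself expects UTF-8, while `converted` is only safe for something that expects the local code page. That matches the point that the trouble starts when the bytes reach an API with a different expectation.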
Also, I assume that the treatment of ResourceString and any other constants is the same?

>> Those programs would be wrong. A user can easily change the console
>> output
>
> > not even have the fonts for and then complains that the program doesn't
> > work. The answer will be "We only support Japanese systems."
>
> Those are business decisions, which are irrelevant to the FPC RTL. Our
> code is written as much as possible to work correctly under all
> circumstances.

To be sure, that's an admirable goal!

> Jonas

Thanks for your comments,

Noah Silva
_______________________________________________
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal