Andreas Dorn wrote on Fri, 25 Sep 2015:
If I understand that correctly, it stores the filename in a string that has been tagged as valid UTF-16.
There are no tags for valid, invalid or unchecked UTF-16. A unicodestring is basically a sequence of widechars. Some operations, such as converting to a different string type, uppercasing/lowercasing, or case-insensitive comparison, may however fail if the string is not valid UTF-16.
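Since the string itself carries no validity information, any check has to be done by hand. A minimal sketch of such a check follows, assuming you only care about surrogate pairing; IsWellFormedUTF16 is a hypothetical helper written for this illustration, not an RTL routine.

```pascal
program Utf16CheckSketch;
{$mode objfpc}{$H+}

// Hypothetical helper, NOT part of the RTL: returns True when every
// surrogate code unit in s is part of a proper high/low pair.
function IsWellFormedUTF16(const s: unicodestring): boolean;
var
  i: SizeInt;
  c: word;
begin
  Result := False;
  i := 1;
  while i <= Length(s) do
  begin
    c := Ord(s[i]);
    if (c >= $D800) and (c <= $DBFF) then
    begin
      // High surrogate: the next code unit must be a low surrogate.
      if (i = Length(s)) or
         (Ord(s[i + 1]) < $DC00) or (Ord(s[i + 1]) > $DFFF) then
        Exit;
      Inc(i, 2);
    end
    else if (c >= $DC00) and (c <= $DFFF) then
      Exit  // low surrogate without a preceding high surrogate
    else
      Inc(i);
  end;
  Result := True;
end;

var
  good, bad: unicodestring;
begin
  good := 'abc';
  // Appending a lone high surrogate is accepted without complaint:
  // the string is just a sequence of widechars.
  bad := good + WideChar($D800);
  writeln('good is well formed: ', IsWellFormedUTF16(good));
  writeln('bad is well formed:  ', IsWellFormedUTF16(bad));
end.
```

The point of the sketch is only that both strings are perfectly legal unicodestring values; whether the content is valid UTF-16 is something the program has to establish itself.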
Do I then have to be careful about any automagic conversions?
No, as long as you don't assign it to a string type other than unicodestring or widestring (this includes passing it as a parameter to a routine that expects a parameter of a different string type).
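To make the distinction concrete, here is a small sketch of which assignments and calls merely copy the data and which one triggers an implicit conversion. TakesUnicode and TakesAnsi are hypothetical routines invented for the example. What the converted result looks like depends on the platform, on DefaultSystemCodePage and on the installed widestring manager, so no output is shown.

```pascal
program StringConvSketch;
{$mode objfpc}{$H+}

// Hypothetical routine: its parameter type matches, so nothing is converted.
procedure TakesUnicode(const s: unicodestring);
begin
  writeln('UTF-16 code units: ', Length(s));
end;

// Hypothetical routine: passing a unicodestring here makes the compiler
// insert a conversion to ansistring at the call site.
procedure TakesAnsi(const s: ansistring);
begin
  writeln('bytes after conversion: ', Length(s));
end;

var
  fname, fcopy: unicodestring;
  w: widestring;
begin
  fname := 'data';
  fname := fname + WideChar($D800);  // lone surrogate: not valid UTF-16

  fcopy := fname;       // unicodestring -> unicodestring: no conversion
  w := fname;           // unicodestring -> widestring: no encoding conversion
  TakesUnicode(fname);  // parameter type matches: no conversion

  // Implicit unicodestring -> ansistring conversion. This is the kind of
  // automatic conversion that can fail or lose data on invalid UTF-16.
  TakesAnsi(fname);
end.
```

The compiler normally emits a warning about the implicit string type conversion on the last call, which is a convenient way to find such places in existing code.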
Is it safe to pass the Filename to procedures from the RTL without risking corruption?
It depends on which RTL procedures.
For me this is more a general problem when dealing with external data. Should I tag raw external data as UTF-16/UTF-8 and be super-careful, or should I tag it as some kind of "raw" string (which one?) and handle any conversions manually?
There is no "raw string" tag. There are only strings and arrays of bytes/words. There are no RTL file APIs accepting arrays of byte/word parameters.
(Now let's better not start about the encoding of filenames on non-Windows OS... :-))
Everything about that for the supported platforms is explained at http://wiki.freepascal.org/FPC_Unicode_support#DefaultFileSystemCodePage
Jonas