Zvone wrote:
>> However, as long as this codepage was one of the windows-xyz,
>> single byte character sets converting back to Ansi with the
>> same codepage should work without data loss and give you back
>> the raw bytes (hopefully). This won't work, for example, with
>> Japanese locale settings.
>
> But there is a string type for that purpose. It is called
> RawByteString. It is defined as AnsiString($ffff) which in effect
> means it is an ansistring with no encoding attached to it so you can
> use it to transfer data from functions and avoid codepage conversions.

The purpose of type RawByteString is to avoid implicit string casts.
As you know, the compiler has been codepage-aware for AnsiStrings since
Delphi 2009. RawByteString makes sense as a parameter type, especially
to avoid writing plenty of overloads, and for nothing else.

> Yes, by default it uses default system code page for conversions to
> Unicode.
>
> RawByteString is a single-byte character type but unlike AnsiString it
> does not have a specific encoding attached to it. So that means it can
> be used to pass values to and from functions that will do
> UnicodeConversions. It is not indended to be used for storing data,
> just mostly for input/output of functions as the official
> documentation specify.

RawByteStrings are not implicitly converted by compiler magic, that's
all, and this type is not documented well.

> So my best bet is that it would be the best to receive raw byte buffer
> (unsigned char or BYTE type) and then place it into RawByteString and
> return that value. This should avoid conversions.

TBytes would have been the right datatype to use. However, that would
break backwards compatibility, since historically "string" was used
everywhere.

> In your own functions you can cast RawByteString as input type and use
> conversion functions to convert from RawByteString to any codepage you
> like (or store it as binary data). There are some functions that do
> this I think the ones you need are SetCodePage() and StringCodePage().

None of that would be a problem in itself; however, the rule is "DO NOT
BREAK BACKWARDS COMPATIBILITY", and that is where the problems begin.

> Other than that, AnsiString can be defined in various codepages for
> example you can declare a
> typedef AnsiStringT<28591> Latin1String; and store data in
> Latin1String type - this will ensure that the codepage conversions are
> always in identical codepage and not dependent on the system code
> page.

I think the ICS components need a lot more changes (basic design
changes).
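
To illustrate the single-byte round trip discussed at the top of this thread, here is a minimal sketch in standard C++. It does not use AnsiStringT, RawByteString or any RTL conversion routine. It assumes ISO-8859-1 (code page 28591, the Latin1String case quoted above), where each byte maps to the Unicode code point with the same value. The function names are hypothetical and chosen only for this example.

```cpp
#include <string>

// ISO-8859-1 (code page 28591) maps each byte 0x00..0xFF to the Unicode
// code point with the same value, so widening is a plain per-byte copy.
std::u16string latin1_to_utf16(const std::string& bytes) {
    std::u16string out;
    out.reserve(bytes.size());
    for (unsigned char b : bytes) {
        out.push_back(static_cast<char16_t>(b));
    }
    return out;
}

// Narrowing succeeds only if every UTF-16 code unit fits in one Latin-1
// byte. Returns false (and leaves `out` empty) when data would be lost.
bool utf16_to_latin1(const std::u16string& text, std::string& out) {
    out.clear();
    out.reserve(text.size());
    for (char16_t c : text) {
        if (c > 0xFF) {
            out.clear();
            return false;
        }
        out.push_back(static_cast<char>(static_cast<unsigned char>(c)));
    }
    return true;
}

// Hypothetical helper: true when bytes -> UTF-16 -> bytes is the identity.
bool round_trips(const std::string& bytes) {
    std::string back;
    return utf16_to_latin1(latin1_to_utf16(bytes), back) && back == bytes;
}
```

With this mapping every possible byte value survives the trip to Unicode and back, which is the property raw data needs when it is carried in a string. A multi-byte code page such as a Japanese one gives no such guarantee, because arbitrary byte sequences are not always valid characters there.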

--
Arno Garrels