Angus Leeming <[EMAIL PROTECTED]> writes:

| Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes:
| > | You mean wchar_t == uint32? UCS-4 is the encoding, no?
| >
| > yes.
| > (but wchar_t on Linux also follows the UCS-4 encoding...)
|
| ????
| There's no encoding in an int.

L"abc" is a wchar_t string that follows the UCS-4 encoding.

| You mean that wchar_t on Linux will hold 4 bytes, I think, whereas on Windows
| wchar_t will hold only 2 bytes and therefore cannot hold all values that can be
| stored in a UCS-4 encoded document. The 2-byte wchar_t is sufficient to hold all
| values that are present in the Base Multilingual Plane. Values outside the BMP
| would need an encoding other than UCS-4 (like the multi-atom UTF-16 encoding) if
| the data were stored in a 2-byte atom.

That would be UCS-2. A 2-byte wchar_t cannot be used for UCS-4 (by definition).

| At least, I *think* that's how it all fits together, but Unicode tends to
| confuse me ;-)
|
| > | typedef std::basic_string<lyx::char_type, lyx::uchar_traits> lyx::ustring;
| > | What do I miss?
| > Perhaps nothing.
|
| So why do you prefer std::vector<lyx::char_type> as the container? You're a
| rational bloke, so there must be a reason?

I just picked something that would work. After all, we are not going to use
(any?) string operations on such strings.

-- 
Lgb
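
For reference, the point about values outside the BMP can be sketched in code. This is only a minimal illustration, not LyX code: `to_utf16` is a hypothetical helper name, and it assumes its input is a valid code point (no surrogate values, nothing above U+10FFFF). A 32-bit atom holds any code point directly, while a 16-bit atom needs two units (a surrogate pair) for anything above U+FFFF.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of LyX): encode one code point as UTF-16.
// Assumes cp is a valid Unicode scalar value.
std::vector<std::uint16_t> to_utf16(std::uint32_t cp)
{
    std::vector<std::uint16_t> out;
    if (cp < 0x10000) {
        // Inside the BMP: a single 16-bit unit is enough.
        out.push_back(static_cast<std::uint16_t>(cp));
    } else {
        // Outside the BMP: split the remaining 20 bits over two units.
        cp -= 0x10000;
        out.push_back(static_cast<std::uint16_t>(0xD800 + (cp >> 10)));   // high surrogate
        out.push_back(static_cast<std::uint16_t>(0xDC00 + (cp & 0x3FF))); // low surrogate
    }
    return out;
}
```

With this sketch, U+00E5 stays a single unit, whereas U+1D11E (a character outside the BMP) becomes the two units 0xD834 0xDD1E. That is why a container of 2-byte atoms can carry UTF-16 but not UCS-4.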