Re: Unicode - help required

Lars Gullik Bjønnes Tue, 06 Jun 2006 09:39:35 -0700

Angus Leeming <[EMAIL PROTECTED]> writes:

| Lars Gullik Bjønnes <[EMAIL PROTECTED]> writes:
| > | You mean wchar_t == uint32? UCS-4 is the encoding, no?
| > 
| > yes.
| > (but wchar_t on linux also follows the UCS-4 encoding...)
| 
| ????
| There's no encoding in an int.


L"abc" is a wchar_t string that follows the UCS-4 encoding.
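
To make that concrete, here is a minimal sketch that reads the elements of a wide literal back as plain integers. The helper names (wide_units, wchar_holds_ucs4) are made up for illustration, and the sketch assumes a mainstream implementation where the wide execution character set is Unicode-based; the standard itself leaves that implementation-defined.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copy each wchar_t element of a wide string into a
// 32-bit integer so the stored values can be inspected directly.
std::vector<std::uint32_t> wide_units(const wchar_t* s) {
    std::vector<std::uint32_t> out;
    for (; *s != L'\0'; ++s)
        out.push_back(static_cast<std::uint32_t>(*s));
    return out;
}

// Hypothetical helper: true when one wchar_t is wide enough to hold any
// UCS-4 value (typically true on Linux, typically false on Windows).
constexpr bool wchar_holds_ucs4() { return sizeof(wchar_t) >= 4; }
```

On a platform with a 4-byte wchar_t, wide_units(L"abc") yields one code point per element, which is what "follows the UCS-4 encoding" amounts to in practice.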
 
| You mean that wchar_t on linux will hold 4 bytes I think, whereas on Windows
| wchar_t will hold only 2 bytes and therefore cannot hold all values that can be
| stored in a UCS-4 encoded document. The 2-byte wchar_t is sufficient to hold all
| values that are present in the Base Multilingual Plane. Values outside the BMP
| would need an encoding other than UCS-4 (like the multi-atom UTF-16 encoding) if
| the data were stored in a 2-byte atom.

UCS-2.

2-byte wchar_t cannot be used for UCS-4.
(by definition)
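
For illustration, this is roughly what the "multi-atom" UTF-16 route looks like for a value outside the BMP. It is only a sketch: to_utf16 is a made-up name, and it assumes the input is already a valid code point (it does no range or surrogate checking).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: encode one code point as UTF-16 code units.
// Assumes cp is a valid Unicode scalar value; no validation is done.
std::vector<std::uint16_t> to_utf16(std::uint32_t cp) {
    std::vector<std::uint16_t> out;
    if (cp < 0x10000) {
        // Inside the BMP: one 2-byte unit is enough (this is all UCS-2 covers).
        out.push_back(static_cast<std::uint16_t>(cp));
    } else {
        // Outside the BMP: split the 20-bit offset into a surrogate pair.
        const std::uint32_t v = cp - 0x10000;
        out.push_back(static_cast<std::uint16_t>(0xD800 + (v >> 10)));   // high surrogate
        out.push_back(static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF))); // low surrogate
    }
    return out;
}
```

So U+0041 stays a single unit, while U+1D11E needs the two units 0xD834 0xDD1E, which is exactly what a 2-byte atom cannot express in UCS-2 or UCS-4.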

| At least, I *think* that's how it all fits together, but unicode tends to
| confuse me ;-)
| 
| > | typedef std::basic_string<lyx::char_type, lyx::uchar_traits> lyx::ustring;
| > | What do I miss?
| > Perhaps nothing.
| 
| So why do you prefer std::vector<lyx::char_type> as the container? You're a
| rational bloke, so there must be a reason?

I just picked something that would work. After all we are not going to
use (any?) string operations on such strings.
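
A rough sketch of the trade-off, with loud assumptions: sketch::char_type is a stand-in typedef and not the real lyx::char_type, and std::u32string is used only as a modern substitute for a basic_string with custom traits such as the lyx::ustring typedef quoted above.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

namespace sketch {

typedef std::uint32_t char_type;              // assumption: stand-in, not lyx::char_type
typedef std::vector<char_type> ucs4_vector;   // plain container, needs no traits class

// Hypothetical helper: widen an ASCII string into the vector form.
inline ucs4_vector from_ascii(const std::string& s) {
    ucs4_vector out;
    out.reserve(s.size());
    for (std::string::size_type i = 0; i < s.size(); ++i)
        out.push_back(static_cast<unsigned char>(s[i]));
    return out;
}

// With a vector, substring search goes through a generic algorithm...
inline bool vec_contains(const ucs4_vector& hay, const ucs4_vector& needle) {
    return std::search(hay.begin(), hay.end(),
                       needle.begin(), needle.end()) != hay.end();
}

// ...whereas a basic_string brings find() and friends along with it.
inline bool str_contains(const std::u32string& hay, const std::u32string& needle) {
    return hay.find(needle) != std::u32string::npos;
}

} // namespace sketch
```

If no string operations are needed, the vector is enough; once they are, the basic_string form saves writing them by hand, at the cost of needing a character traits class for a non-standard character type.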

-- 
        Lgb
