Strake dixit:

>On 26/11/2013, Silvan Jegen <s.je...@gmail.com> wrote:
>> If you would rather not take this version, what approach would
>> you take for the character set mapping when using UTF-8?
>
>On Linux, one can easily make a sparse array with 1-page granularity
>with mmap, and so simply use a (wchar_t []) or (Rune []), but I'm not
>sure how portable this is.

Pretty portable, and 2²¹ * sizeof(wchar_t) is at most 2²³ bytes,
or 8 MiB, with a 32-bit wchar_t, so this would even work.
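
A minimal sketch of that mmap idea, assuming a POSIX-like system with
anonymous mappings (MAP_ANONYMOUS is a common extension, not strict POSIX).
The names sparse_map_new and sparse_map_free are made up for illustration,
and uint32_t stands in for wchar_t or Rune. The kernel hands out zero-filled
pages lazily, so only the pages actually written to cost memory:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1	/* exposes MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define NCODEPOINTS 0x110000UL	/* U+0000..U+10FFFF */

/*
 * Reserve a zero-filled table with one uint32_t slot per code point.
 * Pages are only backed by memory once they are touched.
 * Returns NULL on failure.
 */
static uint32_t *
sparse_map_new(void)
{
	void *p = mmap(NULL, NCODEPOINTS * sizeof(uint32_t),
	    PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return (p == MAP_FAILED ? NULL : p);
}

/* Release a table obtained from sparse_map_new(). */
static void
sparse_map_free(uint32_t *map)
{
	if (map != NULL)
		munmap(map, NCODEPOINTS * sizeof(uint32_t));
}
```

The table is then indexed directly by code point, e.g. map[0xE4] = 0x61,
and a slot that was never written reads back as 0.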

But the common approach for Unicode is a two-level table, split
along the lines of the planes:

/* 0x1100 blocks of 0x100 code points each cover U+0000..U+10FFFF */
struct {
        wchar_t foo[0x100];
} *repl[0x1100];

Do note that wchar_t may be only 16 bits wide (as on Windows), and
that the OS’ own representation of wchar_t may not be Unicode, so
the type would be semantically wrong.

You might want to use uint32_t there.
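
For illustration, the two-level table with uint32_t could be filled and
queried roughly like this. It is only a sketch: repl_set and repl_get are
hypothetical helpers, blocks are allocated on first use, and a stored 0 is
assumed to mean "no mapping", so mapping something to U+0000 is not
possible in this variant:

```c
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_SIZE 0x100	/* code points per block */
#define NBLOCKS    0x1100	/* 0x1100 * 0x100 == 0x110000 */

struct block {
	uint32_t to[BLOCK_SIZE];
};

static struct block *repl[NBLOCKS];	/* NULL: nothing mapped in block */

/* Map code point from to code point to; 0 on success, -1 on error. */
static int
repl_set(uint32_t from, uint32_t to)
{
	struct block **bp;

	if (from > 0x10FFFF)
		return (-1);
	bp = &repl[from >> 8];
	/* allocate the zero-filled block lazily, on first use */
	if (*bp == NULL && (*bp = calloc(1, sizeof(**bp))) == NULL)
		return (-1);
	(*bp)->to[from & 0xFF] = to;
	return (0);
}

/* Look up from; code points without a mapping are returned unchanged. */
static uint32_t
repl_get(uint32_t from)
{
	const struct block *b;
	uint32_t to;

	if (from > 0x10FFFF || (b = repl[from >> 8]) == NULL)
		return (from);
	to = b->to[from & 0xFF];
	return (to != 0 ? to : from);
}
```

Each block costs 1 KiB, and the pointer array itself is 0x1100 pointers,
so a mapping that only touches a few scripts stays small.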

bye,
//mirabilos
-- 
“So somehow you are ALWAYS right. A railcar marked "Ostdeutsche Eisenbahn"
just trundled through Wuppertal here. Sometimes I can hardly believe
it…”                                            -- Natureshadow, via SMS
“Help me think for a moment”                    -- Natureshadow, IRL, 2x
