Strake dixit:

>On 26/11/2013, Silvan Jegen <s.je...@gmail.com> wrote:
>> If you would rather not take this version, what approach would
>> you take for the character set mapping when using UTF-8?
>
>On Linux, one can easily make a sparse array with 1-page granularity
>with mmap, and so simply use a (wchar_t []) or (Rune []), but I'm not
>sure how portable this is.
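
For illustration, the mmap idea quoted above could look roughly like the following. This is only a sketch under assumptions: the sparse_map_* names are made up for this example, uint32_t stands in for wchar_t or Rune, a stored 0 is taken to mean "no mapping", and MAP_ANONYMOUS is assumed to be available (it is on Linux and the BSDs, but it is not required by POSIX).

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define NCODEPOINTS 0x110000u /* U+0000 .. U+10FFFF */

/*
 * Reserve a zero-filled table with one slot per code point.
 * The kernel backs a page with real memory only once it is written,
 * so untouched ranges stay cheap. Returns NULL on failure.
 */
static uint32_t *
sparse_map_new(void)
{
	void *p = mmap(NULL, NCODEPOINTS * sizeof(uint32_t),
	    PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Release the whole table again. */
static void
sparse_map_free(uint32_t *map)
{
	munmap(map, NCODEPOINTS * sizeof(uint32_t));
}

/* Record that code point cp is replaced by repl. */
static void
sparse_map_set(uint32_t *map, uint32_t cp, uint32_t repl)
{
	assert(cp < NCODEPOINTS);
	map[cp] = repl;
}

/* Look up cp; 0 means "no mapping" (an assumption of this sketch). */
static uint32_t
sparse_map_get(const uint32_t *map, uint32_t cp)
{
	return cp < NCODEPOINTS ? map[cp] : 0;
}
```

A caller would create the table once with sparse_map_new(), fill it with sparse_map_set() while parsing the character sets, and call sparse_map_get() for every decoded code point.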

Pretty portable, and 2²¹ * sizeof(wchar_t) is at most 2²³ bytes, or 8 MiB, with a 32-bit wchar_t, so this would even work.

But the common approach for Unicode is to use the planes, i.e. a two-level table of 0x1100 blocks with 0x100 code points each:

	struct { wchar_t foo[0x100]; } *repl[0x1100];

Do note that wchar_t may be only 16 bits wide, and that the OS’ own representation of wchar_t may not be Unicode, so the type would be semantically wrong. You might want to use uint32_t there.

bye,
//mirabilos
-- 
“So somehow you are ALWAYS right. A railcar labelled "Ostdeutsche Eisenbahn" just trundled through Wuppertal. Sometimes I can't believe it…” -- Natureshadow, via SMS
“Help me think for a moment” -- Natureshadow, IRL, 2x
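
A minimal sketch of how the repl table from the message above might be filled and queried, using uint32_t as suggested there. The names struct block, repl_set and repl_get are hypothetical, blocks are allocated lazily with calloc, and a stored 0 is assumed to mean "no mapping".

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_SIZE 0x100u  /* code points per block */
#define NBLOCKS    0x1100u /* 0x1100 * 0x100 = 0x110000 code points */

struct block {
	uint32_t foo[BLOCK_SIZE];
};

/* Top level: one pointer per block, NULL until the block is needed. */
static struct block *repl[NBLOCKS];

/* Map code point cp to 'to'; returns 0 on success, -1 on error. */
static int
repl_set(uint32_t cp, uint32_t to)
{
	struct block **slot;

	if (cp >= NBLOCKS * BLOCK_SIZE)
		return -1;
	slot = &repl[cp >> 8];
	if (*slot == NULL && (*slot = calloc(1, sizeof(**slot))) == NULL)
		return -1;
	(*slot)->foo[cp & 0xFF] = to;
	return 0;
}

/* Look up cp; 0 means "no mapping" (an assumption of this sketch). */
static uint32_t
repl_get(uint32_t cp)
{
	const struct block *b;

	if (cp >= NBLOCKS * BLOCK_SIZE)
		return 0;
	b = repl[cp >> 8];
	return b != NULL ? b->foo[cp & 0xFF] : 0;
}
```

Only blocks that actually contain a mapping cost 1 KiB each; the top-level array of pointers is the fixed overhead, and no mmap is needed.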