On Thu, Nov 28, 2013 at 11:45:33AM -0500, Strake wrote:
> > (either using UTF-8 or UTF-32 indices), right?
>
> I meant Unicodepoints; those are just Unicodecs.

UTF-32 is an encoding whose code units are identical to the Unicode code points, as far as I know. So what I am thinking is that one would either use the UTF-8 representation of the code point as an index, or the code point itself. Since using UTF-8 would not require any conversion (on UTF-8 locales), I think it would be preferable.

> > Since mmap needs a file descriptor argument
>
> It's ignored for anonymous map.

The man page I read did not mention an anonymous map. I found an example on Wikipedia, though:

    anon = (char*)mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_ANON|MAP_SHARED, -1, 0);

Its absence from that man page probably means it may not be that portable after all. Thanks for making me aware of it in any case.
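
To make the portability concern concrete, here is a rough sketch of how one might hedge against a missing anonymous-map flag. It assumes that mapping /dev/zero is an acceptable fallback on systems where neither MAP_ANONYMOUS nor MAP_ANON is defined. The helper name alloc_zeroed is made up for illustration, and I use MAP_PRIVATE rather than MAP_SHARED since the memory is not meant to be shared with another process here.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* asks glibc to expose MAP_ANONYMOUS */
#endif
#include <stddef.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

/* Some systems only spell the flag MAP_ANON. */
#if !defined(MAP_ANONYMOUS) && defined(MAP_ANON)
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical helper: returns len bytes of zero-filled, writable
 * memory, or NULL on failure. Release it with munmap(p, len). */
static void *alloc_zeroed(size_t len)
{
    void *p;
#ifdef MAP_ANONYMOUS
    /* Anonymous mapping: no backing file, so fd is -1 and offset is 0. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
#else
    /* Assumed fallback: a private mapping of /dev/zero. */
    int fd = open("/dev/zero", O_RDWR);
    if (fd < 0)
        return NULL;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);   /* the mapping remains valid after the fd is closed */
#endif
    return p == MAP_FAILED ? NULL : p;
}
```

With something like this, the caller never passes a file descriptor at all, and the choice between the two variants is made at compile time.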