On Wed, Mar 30, 2011 at 04:39:30PM -0500, Peter Samuelson wrote:
>
> [Roger Leigh]
> > As a followup, I would like to get the UTF-8 codeset and collation
> > hardcoded in libc6 directly and sharable by all UTF-8 locales to
> > reduce startup time and needless duplication
>
> Collation is not just a function of character set, it's quite
> locale-dependent.  Not sure if the character class tables (<ctype.h>
> functions, and the [:foo:] posix regex classes) could be shared across
> UTF-8 locales.  I rather suspect not.

Maybe I'm just thinking of ctype.  I thought that (possibly due to
having __STDC_ISO_10646__) the character classes were identical
across all locales.  Collation is probably different.

> When you take out collation and possibly character classes, I'm not
> sure whether there's anything in the UTF-8 locales left to hardcode
> into libc.

There's the actual charmap (localedata/charmaps/UTF-8), which is big
and well worth sharing between locales irrespective of hardcoding.
Looking at it again, I only see the C ctype hardcoded; not the charmap,
so maybe it's autogenerated or not even hardcoded (since it's a 1:1
ASCII:UCS mapping for C).

It would be easier to grok what's going on if it wasn't so hideously
complex and undocumented!


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux             http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?       http://gutenprint.sourceforge.net/
   `-    GPG Public Key: 0x25BFB848   Please GPG sign your mail.