On Tue, Sep 29, 2009 at 04:32:49PM -0400, Tom Lane wrote:
> Roger Leigh <rle...@codelibre.net> writes:
> >> C locale means POSIX behavior and nothing but.
>
> > Indeed it does.  However, making LC_CTYPE be UTF-8 rather than
> > ASCII is both possible and still strictly conforming to the
> > letter of the standard.  There would be some collation and
> > other restrictions ("digit" and other character classes would
> > be constrained to the ASCII characters compared with other UTF-8
> > locales).  However, any existing programs using ASCII would continue
> > to function without any changes to their behaviour.  The only
> > observable change will be that nl_langinfo(CODESET) will return
> > UTF-8, and it will be valid for programs to use UTF-8 encoded
> > text in formatted print functions, etc.
>
> I really, really don't believe that that meets either the letter or
> the spirit of the C standard, at least not if you are intending to
> silently substitute LC_CTYPE=UTF8 when the program has specified
> C/POSIX locale.  (If this is just a matter of what the default
> LANG environment is, of course you can do anything.)
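For anyone who wants to see what their own platform reports today, here is a minimal sketch of the nl_langinfo(CODESET) check mentioned in the quoted text. It uses Python's standard `locale` module, which wraps the C library's setlocale() and nl_langinfo() on POSIX systems. The helper name `codeset_for` is made up for this illustration, and the result depends entirely on the C library in use: glibc has traditionally reported "ANSI_X3.4-1968" for the C locale, while other systems may report something else.

```python
import locale


def codeset_for(locale_name: str) -> str:
    """Return the codeset the C library reports for LC_CTYPE=locale_name.

    Hypothetical helper for illustration only. The previous LC_CTYPE
    setting is restored before returning. POSIX-only: nl_langinfo() is
    not available on every platform.
    """
    # Calling setlocale() with no locale argument queries the current value.
    saved = locale.setlocale(locale.LC_CTYPE)
    try:
        locale.setlocale(locale.LC_CTYPE, locale_name)
        return locale.nl_langinfo(locale.CODESET)
    finally:
        # Put the process back the way we found it.
        locale.setlocale(locale.LC_CTYPE, saved)


if __name__ == "__main__":
    # The printed name is implementation-defined; under the proposal
    # discussed in this thread it would be "UTF-8".
    print("C locale codeset:", codeset_for("C"))
```

A program that only ever handles ASCII text would not notice which of these names comes back; the difference only matters to code that inspects the codeset or passes non-ASCII bytes to locale-aware functions.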
We have spent some time reading the relevant standards documents
(C, POSIX, SUSv2, SUSv3) and haven't yet found anything that would
preclude this.  They all specify minimal requirements for what the
C locale character set must provide; POSIX/SUS are the strictest,
specifying ASCII outright for each codepoint in the range 0-127.
However, these are only the minimal requirements for the locale, and
implementation-specific extensions to ASCII are allowed, which would
therefore permit UTF-8.

Note that LC_CTYPE=C is not required to specify ASCII in any standard
(though POSIX/SUS require that it contain ASCII as a subset of the
whole set).  The language in SUSv2 explicitly states that this is
allowed.  I've also seen documentation indicating that some UNIX
systems, such as HP-UX, already offer a UTF-8 C locale as an option.


Regards,
Roger

--
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux             http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?       http://gutenprint.sourceforge.net/
   `-    GPG Public Key: 0x25BFB848   Please GPG sign your mail.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers