Ned Deily added the comment:

I've looked at this a bit, primarily on OS X 10.9 Mavericks, although I expect 
mostly similar behavior on other recent releases of OS X.  On 10.9, the setting 
of locale variables is done by whatever program is used to launch a shell.  I 
looked at the behavior of the built-in Terminal.app, the third-party 
iTerm2.app, the MacPorts distribution of xterm, and the built-in sshd.  By 
default, the latter two do not set any locale env variables.  Both Terminal.app 
and iTerm2.app set either LANG or LC_CTYPE based on the user's settings for 
"Region" and "Preferred Language" in the "System Preferences" -> "Language & 
Region" control panel.  Three examples:

1. "Region" = "United States", "Preferred Language" = "English":
    -> LANG=en_US.UTF-8

2. "Region" = "Germany", "Preferred Language" = "German":
    -> LANG=de_DE.UTF-8

3. "Region" = "Germany", "Preferred Language" = "English":
    -> LC_CTYPE=UTF-8
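To make the difference between the three cases concrete, here is a minimal 
sketch of the POSIX lookup order (LC_ALL, then the category's own variable, 
then LANG) applied to those environments.  The helper name effective_locale 
is made up for illustration, and returning None when nothing is set is an 
assumption; POSIX leaves that default implementation-defined.

```python
import os


def effective_locale(category, env=None):
    """Return the locale name POSIX precedence selects for `category`.

    Order: LC_ALL, then the category variable (e.g. LC_CTYPE), then LANG.
    Empty values count as unset.  Returns None when nothing applies
    (assumption: the real default is implementation-defined).
    """
    env = os.environ if env is None else env
    for var in ("LC_ALL", category, "LANG"):
        value = env.get(var)
        if value:
            return value
    return None


# The three Terminal.app / iTerm2.app cases described above.
case1 = {"LANG": "en_US.UTF-8"}
case2 = {"LANG": "de_DE.UTF-8"}
case3 = {"LC_CTYPE": "UTF-8"}

for env in (case1, case2, case3):
    print(effective_locale("LC_CTYPE", env), effective_locale("LC_TIME", env))
```

In the third case only LC_CTYPE resolves to a name; every other category 
falls through to the platform default.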

So it is almost certainly the last case that is under discussion here.  Whether 
or not that is a bug is not as clear as it might seem at first.  BSD 
implementations of locale differ from the GNU/Linux version.  Both FreeBSD and 
OS X define a "UTF-8" locale that has only one locale category defined in it: 
LC_CTYPE.  It appears to be a fallback locale used when there is no applicable 
region / language combination, in this case no "en_DE*" locales.

$ ls /usr/share/locale/UTF*
LC_CTYPE

Compare with the en_US* locales:

$ ls /usr/share/locale/en_US*
/usr/share/locale/en_US:
LC_COLLATE  LC_CTYPE    LC_MESSAGES LC_MONETARY LC_NUMERIC  LC_TIME

/usr/share/locale/en_US.ISO8859-1:
LC_COLLATE  LC_CTYPE    LC_MESSAGES LC_MONETARY LC_NUMERIC  LC_TIME

/usr/share/locale/en_US.ISO8859-15:
LC_COLLATE  LC_CTYPE    LC_MESSAGES LC_MONETARY LC_NUMERIC  LC_TIME

/usr/share/locale/en_US.US-ASCII:
LC_COLLATE  LC_CTYPE    LC_MESSAGES LC_MONETARY LC_NUMERIC  LC_TIME

/usr/share/locale/en_US.UTF-8:
LC_COLLATE  LC_CTYPE    LC_MESSAGES LC_MONETARY LC_NUMERIC  LC_TIME
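
The same comparison can be scripted.  This is only a sketch: it assumes the 
locale data lives under /usr/share/locale as in the listings above, and 
defined_categories is a made-up helper name.

```python
import glob
import os


def defined_categories(locale_dir):
    """Return the sorted LC_* entries found directly inside locale_dir."""
    return sorted(
        entry for entry in os.listdir(locale_dir) if entry.startswith("LC_")
    )


if __name__ == "__main__":
    # Same globs as the ls commands above; prints nothing where they match nothing.
    for pattern in ("/usr/share/locale/UTF*", "/usr/share/locale/en_US*"):
        for path in sorted(glob.glob(pattern)):
            if os.path.isdir(path):
                print(path, defined_categories(path))
```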

Now as I read the current POSIX standard, there is nothing wrong with this.  
AFAICT, the standard places no restriction on the format of locale names; in 
particular, it does not mandate that they conform to RFC 1766 or its 
successors.  Further, the standard provides for implementation-specific locales 
(other than the mandatory "POSIX" aka "C" locale) and some platforms provide 
tools to create custom locales, e.g. mklocale(1) on FreeBSD and OS X, 
localedef(1) on GNU/Linux.  So I wonder if the locale module should really be 
imposing its own restrictions on locale names as it does currently.
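
As a rough illustration of the alternative (not a proposed patch): instead of 
parsing the name, one could ask the C library whether it accepts it.  The 
helper platform_accepts below is hypothetical; it relies only on 
locale.setlocale() and locale.Error.  Note that setlocale() changes 
process-wide state, so this sketch is not safe to call from multiple threads.

```python
import locale


def platform_accepts(name, category=locale.LC_CTYPE):
    """Return True if setlocale() accepts `name` for `category`.

    The previous setting for the category is restored before returning.
    """
    saved = locale.setlocale(category)  # query only, no change
    try:
        locale.setlocale(category, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(category, saved)


if __name__ == "__main__":
    # Results depend on which locales the platform has installed.
    for name in ("C", "UTF-8", "en_US.UTF-8"):
        print(name, platform_accepts(name))
```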

From IEEE Std 1003.1, 2013 Edition:
"The capability to specify additional locales to those provided by an 
implementation is optional, denoted by the _POSIX2_LOCALEDEF symbol. If the 
option is not supported, only implementation-supplied locales are available. 
Such locales shall be documented using the format specified in this section. 
[...] The locale definition file shall contain one or more locale category 
source definitions, and shall not contain more than one definition for the same 
locale category. [...]  In the event that some of the information for a locale 
category, as specified in this volume of POSIX.1-2008, is missing from the 
locale source definition, the behavior of that category, if it is referenced, 
is unspecified."

There is a further complication for OS X.  Apple provides a richer native API 
for locales, CFLocale (and its Cocoa equivalent, NSLocale).  So some nuances 
may get lost in the imperfect mapping between CFLocale and the conventional 
LC_* environment variables and between them and Python.  We could look at 
trying to support the native APIs as well.

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html#tag_07
https://developer.apple.com/library/mac/documentation/CoreFoundation/Conceptual/CFLocales/CFLocales.html
https://developer.apple.com/library/mac/documentation/CoreFoundation/Reference/CFLocaleRef/Reference/reference.html

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18378>
_______________________________________