On 9/22/19, Albert-Jan Roskam <sjeik_ap...@hotmail.com> wrote:
>
> Do you think it's a deliberate design choice that decimal and thousands
> where used here as params, and not a 'locale' param? It seems nice to be
> able to specify e.g. locale='dutch' and then all the right lc_numeric,
> lc_monetary, lc_time where used. Or even locale='nl_NL.1252' and you also
> wouldn't need 'encoding' as a separate param. Or might that be bad on
> windows where there's no locale-gen? Just wondering...

FYI, while Windows is distributed with many locales and also supports
custom locales (not something I've had to work with), at least based
on standard locale data, "nl_NL.1252" is not a valid locale for use
with C setlocale in Windows.

Classically, the C runtime in Windows supports "C" (but not "POSIX")
and locales based on a language name or its non-standard three-letter
abbreviation. A locale can also include a country/region name (full or
three-letter abbreviation), plus an optional codepage. If the latter
is omitted, it defaults to the ANSI codepage of the language, or of
the system locale if the language has no ANSI codepage. Examples:

    >>> locale.setlocale(0, 'dutch')
    'Dutch_Netherlands.1252'
    >>> locale.setlocale(0, 'nld')
    'Dutch_Netherlands.1252'
    >>> locale.setlocale(0, 'nld_NLD.850')
    'Dutch_Netherlands.850'

There are also a few compatibility locales such as "american" and
"canadian":

    >>> locale.setlocale(0, 'american')
    'English_United States.1252'
    >>> locale.setlocale(0, 'canadian')
    'English_Canada.1252'

Classically, the Windows API represents locales not as language/region
strings but as numeric locale identifiers (LCIDs). However, back in
2006, Windows Vista introduced locale names, plus a new set of
functions that use locale names instead of LCIDs (e.g.
GetLocaleInfoEx). Locale names are based on BCP-47 language tags,
which include at least an ISO 639 language code. They can also include
an optional ISO 15924 script code (e.g. "Latn" or "Cyrl") and an
optional ISO 3166-1 region code. Strictly speaking, the codes in a
BCP-47 language tag are delimited only by hyphens, but newer versions
of Windows in most cases also allow underscore.

The Universal CRT (used by Python 3.5+) supports Vista locale names.
Recently it also supports using underscore instead of hyphen, plus an
optional ".utf8" or ".utf-8" encoding. Only UTF-8 can be specified for
a BCP-47 locale.
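
To make the shape of these BCP-47-style names concrete, here is a rough sketch of a parser for the language / script / region / ".utf8" pieces described above. The names parse_locale_name and LocaleName are hypothetical, and the regular expression is a deliberate simplification: it is not the full BCP-47 grammar and it is not what the CRT does internally. It only checks the form of the string, not whether the locale actually exists on a given system.

```python
import re
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    """Hypothetical container for the parts of a BCP-47-style locale name."""
    language: str
    script: Optional[str]
    region: Optional[str]
    encoding: Optional[str]


# Simplified pattern (an assumption, not the real grammar):
#   language[-Script][-REGION][.utf8]
# Hyphen or underscore is accepted as the delimiter, and the only
# encoding suffix allowed is ".utf8" or ".utf-8".
_LOCALE_NAME = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:[-_](?P<script>[A-Za-z]{4}))?"
    r"(?:[-_](?P<region>[A-Za-z]{2}|[0-9]{3}))?"
    r"(?:\.(?P<encoding>utf-?8))?$",
    re.IGNORECASE,
)


def parse_locale_name(name: str) -> Optional[LocaleName]:
    """Split a locale name into its parts, or return None if it does not fit."""
    m = _LOCALE_NAME.match(name)
    if m is None:
        return None
    return LocaleName(m["language"], m["script"], m["region"], m["encoding"])


print(parse_locale_name("nl_NL.utf8"))  # language='nl', region='NL', encoding='utf8'
print(parse_locale_name("sr-Cyrl-RS"))  # language='sr', script='Cyrl', region='RS'
print(parse_locale_name("nl_NL.1252"))  # None: a codepage suffix does not fit this form
```

Note that a three-letter abbreviation such as "nld" also fits this pattern as a bare language code, so the sketch cannot tell a classic abbreviation from a BCP-47 name on its own.
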
If the encoding is not specified, BCP-47 locales use the language's
ANSI codepage, or UTF-8 if the language has no ANSI codepage (e.g.
"hi_IN"). Examples:

    >>> locale.setlocale(0, 'nl_NL')
    'nl_NL'
    >>> locale.setlocale(0, 'nl_NL.utf8')
    'nl_NL.utf8'

Older versions of the Universal CRT do not support UTF-8 or underscore
in BCP-47 locale names, so make sure a system is updated if you need
these features.
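
Since support for these names depends on the CRT version, one way to cope is to probe at runtime and fall back. The following is a minimal sketch, assuming you have an ordered list of acceptable spellings; the helper name set_first_supported is hypothetical. It passes locale.LC_ALL rather than a literal 0, because the numeric values of the category constants are platform-specific.

```python
import locale
from typing import Iterable, Optional


def set_first_supported(candidates: Iterable[str],
                        category: int = locale.LC_ALL) -> Optional[str]:
    """Try each locale name in order and keep the first one the C runtime accepts.

    Returns the string reported by setlocale(), or None if every candidate
    was rejected. This changes the process-wide locale as a side effect.
    """
    for name in candidates:
        try:
            return locale.setlocale(category, name)
        except locale.Error:
            # This spelling is not supported by the runtime; try the next one.
            continue
    return None


# Prefer the UTF-8 BCP-47 spelling, then the hyphenated name, then the
# classic Windows language name. Which one succeeds depends on the system.
chosen = set_first_supported(["nl_NL.utf8", "nl-NL", "dutch"])
print(chosen)
```

A None result here just means none of the listed spellings is available, which on Windows may indicate an older Universal CRT and on other platforms a locale that has not been generated.
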