Eryk Sun <eryk...@gmail.com> added the comment:

> Read the ANSI code page on Windows,

I don't see why the Windows implementation is inconsistent with POSIX here. If 
it were changed to be consistent, the default encoding at startup would remain 
the same, since setlocale(LC_CTYPE, "") uses the process code page from 
GetACP(). In many if not most cases, no one would be the wiser. But it seems to 
me that if a script calls setlocale(LC_CTYPE, "el_GR"), then it clearly wants 
to encode Greek text (code page 1253). open() with encoding passed as None or 
"locale" should respect this. Similarly, if it calls setlocale(LC_CTYPE, 
".UTF-8"), then it wants the default locale (language/region), but with UTF-8 
encoding.
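
For comparison, here is a rough sketch of the POSIX side, where the locale 
encoding follows whatever LC_CTYPE is currently set to. It assumes a POSIX 
system that provides <langinfo.h>. The helper name ctype_codeset() is 
hypothetical, made up for illustration, and is not how CPython itself is 
written:

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: switch LC_CTYPE to `name`, copy the codeset name the
   C library reports for it into `buf`, then restore the previous LC_CTYPE.
   Returns buf on success, or NULL if the requested locale is unavailable. */
const char *
ctype_codeset(const char *name, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);

    /* setlocale(LC_CTYPE, NULL) only queries. Copy the result, because a
       later setlocale() call may overwrite the returned string. */
    char saved[256];
    const char *current = setlocale(LC_CTYPE, NULL);
    snprintf(saved, sizeof(saved), "%s", current ? current : "C");

    if (setlocale(LC_CTYPE, name) == NULL) {
        return NULL;    /* the requested locale is not available */
    }

    /* The codeset now reflects the locale that was just selected. */
    snprintf(buf, size, "%s", nl_langinfo(CODESET));

    setlocale(LC_CTYPE, saved);
    return buf;
}
```

Calling ctype_codeset("", buf, sizeof(buf)) reports the codeset of the 
environment's default locale, while passing a specific locale name reports 
the codeset for that locale, if it is installed. The exact strings returned 
depend on the C library.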

The following is a snippet to get the current locale encoding with ucrt in 
Windows:

    #include <locale.h>
    #include <windows.h>    /* GetACP() */

    int cp = 0;
    __crt_locale_data_public *locale_data;

    /* _get_current_locale() returns a copy that must be freed. */
    _locale_t locale = _get_current_locale();
    if (locale) {
        locale_data = (__crt_locale_data_public *)locale->locinfo;
        cp = locale_data->_locale_lc_codepage;
        _free_locale(locale);
    }

    if (cp == 0) {
        /* "C" locale. The CRT in effect uses Latin-1 (cp28591), but
           Windows Python prefers the process code page. */
        cp = GetACP();
    }

With ucrt, the C runtime was changed to hide most of the locale definition that 
was previously public, but it intentionally defines __crt_locale_data_public, 
so I'm assuming it's there for programs to use. That said, the fact that we 
have to cast locinfo seems suspect to me. Steve Dower could maybe check with 
the ucrt devs to ensure that this is supported. 

There's also ___lc_codepage_func() to get the same value more simply, and more 
efficiently since the current locale data doesn't have to be copied and freed. 
However, it's documented as internal and could be removed (unlikely as that is).

----------
nosy: +eryksun

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue43552>
_______________________________________