Eryk Sun <eryk...@gmail.com> added the comment:
On most platforms, unless UTF-8 mode is enabled, locale.getpreferredencoding(False) returns the LC_CTYPE encoding of the current locale. For example, in Linux:

    >>> locale.setlocale(locale.LC_CTYPE, 'en_US.UTF-8')
    'en_US.UTF-8'
    >>> locale.getpreferredencoding(False)
    'UTF-8'
    >>> locale.setlocale(locale.LC_CTYPE, 'en_US.iso-88591')
    'en_US.iso-88591'
    >>> locale.getpreferredencoding(False)
    'ISO-8859-1'

If the designers of the io module had wanted the preferred encoding to always be the default encoding from setlocale(LC_CTYPE, ""), they would have used and documented locale.getpreferredencoding(True).

---

In Windows, locale.getpreferredencoding(False) always returns the default encoding from locale.getdefaultlocale(), which is the process active (ANSI) code page. Changing it to track the LC_CTYPE locale would be convenient for applications and scripts running in Windows 10, for which the CRT's POSIX locale implementation has supported UTF-8 since spring of 2018. The base behavior can't be changed at this point, but a -X option and/or environment variable could enable locale.getpreferredencoding(False) -- i.e. locale._get_locale_encoding() -- to return the current LC_CTYPE encoding in Windows, as it does in POSIX.

----------
nosy: +eryksun

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue43140>
_______________________________________
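To illustrate the behavior discussed above, here is a minimal sketch of a helper that reports the encoding of the *current* LC_CTYPE locale on both POSIX and Windows. The function name current_ctype_encoding() is hypothetical and is not part of the locale module. The Windows branch is an assumption: it parses the string returned by setlocale(LC_CTYPE), expecting forms such as "English_United States.1252" or "en_US.UTF-8", and has not been checked against every locale name the CRT can return. On POSIX it calls nl_langinfo(CODESET) directly, so it is not affected by UTF-8 mode.

```python
import locale
import sys


def current_ctype_encoding():
    """Best-effort name of the encoding of the current LC_CTYPE locale.

    Hypothetical helper for illustration only; not part of the stdlib.
    """
    if sys.platform != "win32":
        # POSIX: the codeset of the current LC_CTYPE locale.
        return locale.nl_langinfo(locale.CODESET)

    # Windows (assumed format): query the current LC_CTYPE locale name
    # without changing it, then take whatever follows the first dot.
    name = locale.setlocale(locale.LC_CTYPE)
    _, _, codeset = name.partition(".")
    if not codeset:
        # No code page suffix (e.g. the "C" locale): fall back to the
        # value Python reports today, the process ANSI code page.
        return locale.getpreferredencoding(False)
    if codeset.isdigit():
        # A numeric code page such as "1252" maps to the codec "cp1252".
        return "cp" + codeset
    return codeset


if __name__ == "__main__":
    print(current_ctype_encoding())
```

Calling locale.setlocale(locale.LC_CTYPE, ...) first and then current_ctype_encoding() should track the change on either platform, which is the opt-in behavior the comment proposes for locale.getpreferredencoding(False) in Windows.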