On 5/18/22, Chris Angelico <ros...@gmail.com> wrote:
>
> Real solution? Set the command prompt to codepage 65001. Then it
> should be able to handle all characters. (Windows-65001 is its alias
> for UTF-8.)

I suggest using win_unicode_console for Python versions prior to 3.6:

https://pypi.org/project/win_unicode_console

This package uses the console's native 16-bit character support with
UTF-16 text, as does Python 3.6+. Compared to the console's incomplete
and broken support for UTF-8, the console's support for UTF-16 (or just
UCS-2 prior to Windows 10) is far more functional and reliable across
the commonly used versions of Windows: 7, 8, and 10.

Reading console input as UTF-8 is still limited to ASCII up to and
including Windows 11, which for me is a showstopper. Non-ASCII
characters are read as null bytes, which is useless.

Support for writing UTF-8 to the console screen buffer is implemented
correctly in recent builds of Windows 10 and 11, and is mostly correct
in Windows 8. Prior to Windows 8, writing UTF-8 to the console is badly
broken. The write call returns the number of UTF-16 code units written
instead of the number of bytes written, which confuses buffered writers
into writing a lot of junk to the screen.
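
To illustrate why a wrong return value produces junk, here is a rough
simulation. It is only a sketch and does not touch the real console: the
function names are made up for this example, and the retry loop is a
simplified stand-in for what a buffered writer does after what it
believes was a short write.

```python
def utf8_len(text):
    """Number of bytes in the UTF-8 encoding of text."""
    return len(text.encode("utf-8"))


def utf16_units(text):
    """Number of 16-bit code units in the UTF-16 encoding of text."""
    return len(text.encode("utf-16-le")) // 2


def simulate_buffered_write(text):
    """Return the chunks a writer would send if every write reported
    UTF-16 code units instead of bytes (hypothetical model of the bug).
    """
    data = text.encode("utf-8")
    emitted = []
    while data:
        # The whole chunk really is written to the screen...
        emitted.append(data)
        # ...but the reported count is in UTF-16 code units, not bytes.
        shown = data.decode("utf-8", errors="replace")
        reported = utf16_units(shown)
        # The writer assumes the rest was not written and retries it.
        data = data[reported:]
    return emitted


if __name__ == "__main__":
    sample = "h\u00e9llo"
    print(utf8_len(sample), utf16_units(sample))
    print(simulate_buffered_write(sample))
```

For "h\u00e9llo" the UTF-8 encoding is 6 bytes but only 5 UTF-16 code
units, so the simulated writer sends the full text and then sends the
trailing "o" a second time. With text that is mostly non-ASCII, the
retried remainders start in the middle of multi-byte sequences, which is
where the garbage comes from.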