On Sun, Mar 26, 2017 at 5:58 PM, Chris Angelico <ros...@gmail.com> wrote:
>> The Windows console can render any character in the BMP, but it
>> requires configuring font linking for fallback fonts. It's Windows, so
>> of course the supported UTF format is UTF-16. The console's UTF-8
>> support (codepage 65001) is too buggy to even consider using it.
>
> Is it actually UTF-16, or is it UCS-2?

Pedantically speaking, it's UCS-2. Console buffers aren't necessarily valid UTF-16, i.e. they can contain lone surrogate codes or invalid surrogate pairs.

How a surrogate code gets rendered depends on the font. It could be an empty box, a box containing a question mark, or simply empty space. That applies even when the code is part of a valid UTF-16 surrogate pair, so the console can't display non-BMP characters such as emojis. Such characters can still be copied from the console to the clipboard and displayed in another program.

Windows file systems are also UCS-2. For the most part this isn't an issue, since the source of text and filenames will normally be valid UTF-16.
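
To make the UTF-16 versus UCS-2 distinction concrete, here is a small Python sketch. It does not touch the console at all; it only inspects 16-bit code units. The helper names utf16_units and is_valid_utf16 are made up for this illustration, and the 'surrogatepass' error handler is used so that Python will encode a lone surrogate instead of raising an error.

```python
def utf16_units(text):
    """Return the 16-bit code units of text, allowing lone surrogates."""
    data = text.encode("utf-16-le", errors="surrogatepass")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def is_valid_utf16(units):
    """True if every surrogate is part of a high/low surrogate pair."""
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # A high surrogate must be followed by a low surrogate.
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2
                continue
            return False
        if 0xDC00 <= unit <= 0xDFFF:
            # A low surrogate with no preceding high surrogate.
            return False
        i += 1
    return True


emoji = "\U0001F600"   # non-BMP character, needs a surrogate pair
lone = "\ud83d"        # high surrogate on its own

print([hex(u) for u in utf16_units(emoji)])  # ['0xd83d', '0xde00']
print(is_valid_utf16(utf16_units(emoji)))    # True
print(is_valid_utf16(utf16_units(lone)))     # False
```

A buffer that only promises "a sequence of 16-bit units" (UCS-2 in the loose sense used above) can hold either sequence. Only the first one is valid UTF-16.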