[issue44510] file.read() UnicodeDecodeError with UTF-8 BOM in files on Windows

2021-06-25 Thread Eryk Sun
Eryk Sun added the comment:

> On Windows we currently still default to your console encoding

In Windows, the default encoding for open() is the ANSI code page of the current process [1], from GetACP(), which is based on the system locale, unless it's overridden to UTF-8 in the application ma
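As a rough illustration of the point about the default encoding, the sketch below reports which encoding open() would fall back to when no encoding= argument is given. The helper name default_text_encoding is hypothetical, not part of the original message. It relies on locale.getpreferredencoding(False), which on Windows reflects the process ANSI code page unless Python's UTF-8 mode is enabled.

```python
import codecs
import locale


def default_text_encoding():
    """Return the normalized name of the locale encoding that open()
    uses in text mode when no encoding= argument is passed.

    Hypothetical helper for illustration only.
    """
    # False means: do not call setlocale(), just report the current setting.
    name = locale.getpreferredencoding(False)
    # codecs.lookup() maps aliases such as "UTF-8" to Python's canonical
    # codec name such as "utf-8".
    return codecs.lookup(name).name


if __name__ == "__main__":
    print("open() default encoding:", default_text_encoding())
```

On a Windows machine with a Western European system locale this typically reports cp1252, which is why passing encoding= explicitly is the safer habit for files of known encoding.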

[issue44510] file.read() UnicodeDecodeError with UTF-8 BOM in files on Windows

2021-06-25 Thread Steve Dower
Steve Dower added the comment: The file that fails contains a UTF-8 BOM at the start, a three-byte sequence (EF BB BF, the UTF-8 encoding of U+FEFF) indicating that the file is definitely UTF-8. Unfortunately, none of Python's default settings will handle this, because it's a convention that only really exists on Windows.
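One way to read such a file, sketched here under the assumption that the content really is UTF-8, is the standard "utf-8-sig" codec, which drops a leading BOM if one is present and otherwise behaves like plain UTF-8. The function name read_text_without_bom and the sample file are hypothetical and only serve the demonstration.

```python
import codecs
import os
import tempfile


def read_text_without_bom(path):
    """Read a UTF-8 text file, dropping a leading BOM if there is one.

    Hypothetical helper for illustration only.
    """
    # "utf-8-sig" skips an initial EF BB BF on decoding; files without
    # a BOM decode exactly as they would with "utf-8".
    with open(path, encoding="utf-8-sig") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "with_bom.txt")
        # Write the raw BOM bytes followed by UTF-8 encoded text.
        with open(sample, "wb") as f:
            f.write(codecs.BOM_UTF8 + "héllo".encode("utf-8"))

        # Plain "utf-8" keeps the BOM as a U+FEFF character in the text.
        with open(sample, encoding="utf-8") as f:
            print(repr(f.read()))

        # "utf-8-sig" strips it.
        print(repr(read_text_without_bom(sample)))
```

Naming the encoding explicitly also avoids depending on the system's ANSI code page, where the BOM bytes may be mis-decoded or rejected.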