On Sun, Jan 24, 2021 at 10:17 AM Guido van Rossum <[email protected]> wrote:
>
> I have definitely seen BOMs written by Notepad on Windows 10.
>
> Why can’t the future be that open() in text mode guesses the encoding?

I don't like guessing. As a Japanese person, I have seen a lot of
mojibake caused by wrong guesses.
I don't think guessing the encoding belongs in reliable software.

On the other hand, if we add `open_utf8()`, it's easy to ignore the BOM:

* When reading, use "utf-8-sig". (It can also read UTF-8 without a BOM.)
* When writing, use "utf-8".
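
The two rules above could be sketched roughly as follows. This is only an
illustration: `open_utf8()` does not exist in the standard library, and the
signature, the mode check, and the sample file name here are my assumptions,
not a settled design.

```python
import os
import tempfile


def open_utf8(path, mode="r", **kwargs):
    """Hypothetical helper (not an existing builtin): text-mode open() pinned to UTF-8."""
    if "b" in mode:
        raise ValueError("open_utf8() is text-mode only")
    # "utf-8-sig" drops a leading BOM when decoding and also accepts
    # BOM-less UTF-8. Plain "utf-8" never emits a BOM when encoding.
    # Simplification: only a pure read mode ("r" / "rt") uses "utf-8-sig".
    encoding = "utf-8-sig" if mode.replace("t", "") == "r" else "utf-8"
    return open(path, mode, encoding=encoding, **kwargs)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.csv")

    # Simulate a file saved as "UTF-8 with BOM" by another program.
    with open(path, "wb") as f:
        f.write(b"\xef\xbb\xbfname,city\n")

    # Reading: the BOM is silently skipped.
    with open_utf8(path) as f:
        text = f.read()

    # Writing: no BOM is added.
    with open_utf8(path, "w") as f:
        f.write(text)
    with open(path, "rb") as f:
        raw = f.read()
```

Mixed modes such as "r+" are treated as writes in this sketch, so a BOM
already in the file would not be skipped there. A real design would have to
decide what to do in that case.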

Although UTF-8 with BOM is not recommended, and Notepad has used UTF-8
without BOM as its default encoding since Windows 10 version 1903,
UTF-8 with BOM is still used in some cases.
For example, Excel reads a CSV file as either UTF-8 with BOM or a legacy
encoding. So some CSV files are written with a BOM.

Regards,
-- 
Inada Naoki  <[email protected]>
_______________________________________________
Python-ideas mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/BJC6LCYNO2HHRLHF4TFHWTG53M4YL6LL/
Code of Conduct: http://python.org/psf/codeofconduct/