On Sun, Jan 24, 2021 at 10:17 AM Guido van Rossum <[email protected]> wrote:
>
> I have definitely seen BOMs written by Notepad on Windows 10.
>
> Why can’t the future be that open() in text mode guesses the encoding?

I don't like guessing. As a Japanese person, I have seen a lot of mojibake caused by wrong guesses. I don't think guessing the encoding belongs in reliable software.

On the other hand, if we add `open_utf8()`, it's easy to ignore the BOM:

* When reading, use "utf-8-sig". (It can also read UTF-8 without a BOM.)
* When writing, use "utf-8".

Although UTF-8 with BOM is not recommended, and Notepad has used UTF-8 without BOM as its default encoding since Windows 10 version 1903, UTF-8 with BOM is still used in some cases. For example, Excel reads a CSV file as UTF-8 only when it starts with a BOM; otherwise it assumes a legacy encoding. So some CSV files are written with a BOM.

Regards,

-- 
Inada Naoki <[email protected]>

Message archived at https://mail.python.org/archives/list/[email protected]/message/BJC6LCYNO2HHRLHF4TFHWTG53M4YL6LL/
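
To make the read/write rule above concrete, here is a minimal sketch of what such a helper might look like. `open_utf8()` is only a proposal in this thread, not an existing builtin, so the signature, the mode handling, and the `demo()` function below are all assumptions for illustration. The sketch only relies on the standard "utf-8-sig" and "utf-8" codecs.

```python
import codecs
import os
import tempfile


def open_utf8(path, mode="r", **kwargs):
    """Hypothetical helper (not in the stdlib): text-mode open() fixed to UTF-8.

    Assumption: pure read modes use "utf-8-sig" so an optional BOM is skipped;
    every other mode uses plain "utf-8" so no BOM is ever written.
    """
    if "b" in mode:
        raise ValueError("open_utf8() is for text mode only")
    reading_only = "r" in mode and "+" not in mode
    encoding = "utf-8-sig" if reading_only else "utf-8"
    return open(path, mode, encoding=encoding, **kwargs)


def demo():
    """Read a CSV that starts with a BOM, then write it back without one."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "with_bom.csv")
        with open(src, "wb") as f:
            # Simulate a file saved by a tool that prepends a BOM.
            f.write(codecs.BOM_UTF8 + "名前,値\n".encode("utf-8"))

        with open_utf8(src) as f:
            text = f.read()  # the BOM is consumed by "utf-8-sig"

        dst = os.path.join(tmp, "without_bom.csv")
        with open_utf8(dst, "w") as f:
            f.write(text)

        with open(dst, "rb") as f:
            raw = f.read()
    return text, raw
```

With this sketch, the text returned by `demo()` does not start with U+FEFF and the rewritten file does not start with the BOM bytes. No guessing is involved: a file in a legacy encoding still fails with `UnicodeDecodeError` instead of being silently misread.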
