Marc-Andre Lemburg added the comment:

Just to add some more background:
The LE and BE codecs are meant to be used when you already know the endianness of the platform you are targeting, e.g. when you work on strings that were read after the initial BOM, or when you write to an output string in chunks after having written the initial BOM. As such, they don't treat the BOM specially: it is a valid code point (U+FEFF), so they pass it through as-is.

If you do want BOM handling, the UTF-16 codec is the right choice. When encoding, it defaults to the platform's endianness and writes a BOM to indicate which choice it made.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue25325>
_______________________________________
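The difference between the codecs described above can be seen in a short Python 3 sketch. It uses only the standard `codecs` and `sys` modules; the variable names (`text`, `chunks`, `chunked` and so on) are made up for illustration and do not come from the issue.

```python
import codecs
import sys

text = "abc"

# The endian-specific codecs never write a BOM when encoding.
le = text.encode("utf-16-le")   # b"a\x00b\x00c\x00"
be = text.encode("utf-16-be")   # b"\x00a\x00b\x00c"

# The generic codec writes a BOM in the platform's byte order first.
generic = text.encode("utf-16")
native_bom = (codecs.BOM_UTF16_LE if sys.byteorder == "little"
              else codecs.BOM_UTF16_BE)

# Decoding with the generic codec consumes the BOM ...
roundtrip = generic.decode("utf-16")

# ... while an endian-specific codec passes it through as U+FEFF.
passed_through = (codecs.BOM_UTF16_LE + le).decode("utf-16-le")

# Chunked output: write the BOM once, then encode each chunk
# with the endian-specific codec so no further BOMs appear.
chunks = ["ab", "c"]
chunked = codecs.BOM_UTF16_LE + b"".join(
    chunk.encode("utf-16-le") for chunk in chunks
)

print(generic.startswith(native_bom))     # True
print(repr(passed_through))               # '\ufeffabc'
print(chunked.decode("utf-16") == text)   # True
```

The chunked case is the one the LE and BE codecs are designed for: using the generic codec on every chunk would insert a BOM at the start of each one.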