[issue25325] UTF-16LE, UTF-16BE, UTF-32LE, and UTF-32BE encodings don't add/remove BOM on encode/decode

Marc-Andre Lemburg Tue, 06 Oct 2015 14:30:02 -0700

Marc-Andre Lemburg added the comment:

Just to add some more background:


The LE and BE codecs are meant to be used when you already know the endianness 
of the platform you are targeting, e.g. when you work on strings that were 
read after the initial BOM, or write to an output string in chunks after having 
written the initial BOM. As such, they don't treat the BOM specially: it is 
a valid code point, so they pass it through as-is.
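
To illustrate the chunked-output case described above, here is a minimal sketch. The chunk contents and variable names are made up for the example; only the codec names and the `codecs.BOM_UTF16_LE` constant come from the standard library.

```python
import codecs

# Hypothetical chunks that together form one output stream.
chunks = ["Hello, ", "world"]

# Write the BOM once, then encode every chunk with the LE codec,
# which never adds a BOM of its own.
out = bytearray(codecs.BOM_UTF16_LE)
for chunk in chunks:
    out += chunk.encode("utf-16-le")

# Decoding the whole stream with the LE codec keeps the BOM as an
# ordinary U+FEFF character at the start of the result.
passed_through = bytes(out).decode("utf-16-le")

# Decoding the same bytes with the generic codec consumes the BOM.
stripped = bytes(out).decode("utf-16")
```

Encoding each chunk with "utf-16" instead would put a BOM at the start of every chunk, which is why the endianness-specific codecs fit this use.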

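If you do want BOM handling, the UTF-16 codec is the right choice. When 
encoding, it defaults to the platform's endianness and uses the BOM to indicate 
which choice it made. When decoding, it reads the BOM to determine the byte 
order and removes it from the result.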
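
A short sketch of the UTF-16 codec's BOM handling, assuming a sample string "abc" chosen only for illustration. The expected BOM depends on the machine running the code, so it is derived from `sys.byteorder` rather than hard-coded.

```python
import codecs
import sys

text = "abc"  # illustrative sample string

# The generic codec encodes in the platform's byte order and
# prepends the matching BOM.
generic = text.encode("utf-16")
native_bom = (
    codecs.BOM_UTF16_LE if sys.byteorder == "little" else codecs.BOM_UTF16_BE
)
has_native_bom = generic.startswith(native_bom)

# On decode, the generic codec uses the BOM to pick the byte order
# and does not include it in the result.
round_trip = generic.decode("utf-16")

# This also works for data written in big-endian order,
# whatever the local platform's byte order is.
big_endian_data = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
from_be = big_endian_data.decode("utf-16")
```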

----------

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue25325>
_______________________________________

[issue25325] UTF-16LE, UTF-16BE, UTF-32LE, and UTF-32BE encodings don't add/remove BOM on encode/decode
