New submission from Erick Tryzelaar:

I was playing around with Python 3's io functions and found that, when writing to a file opened with encoding='utf-16', TextIOWrapper.write re-writes the UTF-16 BOM for every string:
>>> f=open('foo', 'w', encoding='utf-16')
>>> print('1234', file=f)
>>> print('5678', file=f)
>>> open('foo', 'rb').read()
b'\xff\xfe1\x002\x003\x004\x00\xff\xfe\n\x00\xff\xfe5\x006\x007\x008\x00\xff\xfe\n\x00'
>>> open('foo', 'r', encoding='utf-16').read()
'1234\ufeff\n\ufeff5678\ufeff\n'
>>>

With the attached patch, it appears to generate the correct file:

>>> f=open('foo', 'w', encoding='utf-16')
>>> print('1234', file=f)
>>> print('5678', file=f)
>>> open('foo', 'rb').read()
b'\xff\xfe1\x002\x003\x004\x00\n\x005\x006\x007\x008\x00\n\x00'
>>> open('foo', 'r', encoding='utf-16').read()
'1234\n5678\n'
>>>

----------
components: Library (Lib)
files: io.py.patch
messages: 59438
nosy: erickt
severity: normal
status: open
title: TextIOWrapper.write writes utf BOM for every string
type: behavior
versions: Python 3.0

Added file: http://bugs.python.org/file9091/io.py.patch

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1753>
__________________________________
_______________________________________________
Python-bugs-list mailing list
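
For anyone who wants to check the expected behavior without reading the raw bytes by eye, here is a minimal sketch of a regression check. It is not part of io.py.patch. The helper names count_boms and write_lines are hypothetical, and the final values assume an interpreter where the BOM is written only once at the start of the stream.

```python
import codecs
import os
import tempfile


def count_boms(path):
    """Count UTF-16 BOM code units (either byte order) in the file at path."""
    with open(path, 'rb') as f:
        data = f.read()
    # Split into 2-byte code units so a BOM is only matched on a unit boundary.
    units = [data[i:i + 2] for i in range(0, len(data), 2)]
    return sum(u in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE) for u in units)


def write_lines(path, lines):
    """Write each line with a separate print() call, as in the report."""
    with open(path, 'w', encoding='utf-16') as f:
        for line in lines:
            print(line, file=f)


path = os.path.join(tempfile.mkdtemp(), 'foo')
write_lines(path, ['1234', '5678'])

bom_count = count_boms(path)
with open(path, 'r', encoding='utf-16') as f:
    text = f.read()

print(bom_count, repr(text))
```

With the behavior described in the report, the count would be 4 (one BOM per underlying write call) and the decoded text would contain stray '\ufeff' characters. With the fix, the count should be 1 and the text should round-trip unchanged.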