New submission from STINNER Victor <victor.stin...@haypocalc.com>:

The following code fails with an AssertionError('###\ufeffdef'):

import codecs
_open = codecs.open
#_open = open
filename = "test"
with _open(filename, 'w', encoding='utf_16') as f:
    f.write('abc')
    pos = f.tell()
with _open(filename, 'w', encoding='utf_16') as f:
    f.seek(pos)
    f.write('def')
    f.seek(0)
    f.write('###')
with _open(filename, 'r', encoding='utf_16') as f:
    content = f.read()
    assert content == '###def', ascii(content)

This is a bug in StreamWriter.seek(): after a seek, it should update the
encoder state so that no new BOM is written. The fix has to be made in the
StreamWriter class of each stateful codec, or a generic stateful StreamWriter
class should be implemented in the codecs module.
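
As a rough illustration of the state update described above, here is a
minimal sketch. SeekAwareWriter and rewrite_with_seek are hypothetical names,
not part of the codecs module, and the approach (reset() at offset 0,
setstate(0) elsewhere) is an assumption that is only exercised with utf_16
here; other stateful codecs may need different handling. io.BytesIO stands in
for the file on disk.

```python
import codecs
import io


class SeekAwareWriter:
    """Hypothetical writer, not part of the codecs module."""

    def __init__(self, stream, encoding):
        self.stream = stream
        self.encoder = codecs.getincrementalencoder(encoding)()

    def write(self, text):
        self.stream.write(self.encoder.encode(text))

    def tell(self):
        return self.stream.tell()

    def seek(self, offset):
        self.stream.seek(offset)
        if offset == 0:
            # Back at the start of the stream: the next write needs a BOM.
            self.encoder.reset()
        else:
            # Elsewhere: tell the encoder the BOM has already been written.
            self.encoder.setstate(0)


def rewrite_with_seek():
    # First pass: write 'abc' and remember the position, as in the report.
    first = SeekAwareWriter(io.BytesIO(), 'utf_16')
    first.write('abc')
    pos = first.tell()

    # Second pass: a fresh buffer stands in for the file truncated by 'w'.
    raw = io.BytesIO()
    second = SeekAwareWriter(raw, 'utf_16')
    second.seek(pos)
    second.write('def')
    second.seek(0)
    second.write('###')
    return raw.getvalue().decode('utf_16')


print(ascii(rewrite_with_seek()))
```

With this sketch the decoded content no longer contains a U+FEFF character
between '###' and 'def'.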

Python supports the following stateful codecs:

 * cp932
 * cp949
 * cp950
 * euc_jis_2004
 * euc_jisx0213
 * euc_jp
 * euc_kr
 * gb18030
 * gbk
 * hz
 * iso2022_jp
 * iso2022_jp_1
 * iso2022_jp_2
 * iso2022_jp_2004
 * iso2022_jp_3
 * iso2022_jp_ext
 * iso2022_kr
 * shift_jis
 * shift_jis_2004
 * shift_jisx0213
 * utf_8_sig
 * utf_16
 * utf_32
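
To show what "stateful" means in practice, the sketch below feeds two chunks
through one incremental encoder for two of the codecs listed above. The
helper name encode_chunks is made up for this example. Only the first chunk
should carry the BOM, which is the state a seek has to take into account.

```python
import codecs


def encode_chunks(encoding, chunks):
    """Encode each chunk with one shared incremental encoder."""
    encoder = codecs.getincrementalencoder(encoding)()
    return [encoder.encode(chunk) for chunk in chunks]


utf8_parts = encode_chunks('utf_8_sig', ['a', 'b'])
utf16_parts = encode_chunks('utf_16', ['a', 'b'])

# Report, per chunk, whether the encoded bytes start with the codec's BOM.
print([part.startswith(codecs.BOM_UTF8) for part in utf8_parts])
print([part.startswith(codecs.BOM_UTF16) for part in utf16_parts])
```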

This bug has already been fixed in TextIOWrapper: issue #5006.
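
For comparison, a small in-memory sketch of the TextIOWrapper behaviour. The
function name and the default offset of 8 are assumptions for this example:
8 is the byte position after a 2-byte BOM plus 'abc' in UTF-16, and an empty
io.BytesIO stands in for the truncated file.

```python
import io


def text_io_wrapper_roundtrip(pos=8):
    raw = io.BytesIO()
    f = io.TextIOWrapper(raw, encoding='utf_16')
    f.seek(pos)      # non-zero position: the next write should not add a BOM
    f.write('def')
    f.seek(0)        # start of stream: the BOM is written with the next write
    f.write('###')
    f.flush()
    # Read the bytes back before the wrapper (and its buffer) is closed.
    return raw.getvalue().decode('utf_16')


print(ascii(text_io_wrapper_roundtrip()))
```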

----------
messages: 139969
nosy: haypo
priority: normal
severity: normal
status: open
title: codecs: StreamWriter issue with stateful codecs after a seek
versions: Python 2.7, Python 3.2, Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12512>
_______________________________________