[issue4862] utf-16 BOM is not skipped after seek(0)

2009-03-04 Thread STINNER Victor
STINNER Victor added the comment: @benjamin: ok, great.

[issue4862] utf-16 BOM is not skipped after seek(0)

2009-03-04 Thread Benjamin Peterson
Benjamin Peterson added the comment: Ah, I forgot this wasn't applied to the Python implementation. Fixed in r70184.

[issue4862] utf-16 BOM is not skipped after seek(0)

2009-03-04 Thread STINNER Victor
STINNER Victor added the comment:
> This has been fixed by the io-c branch merge.
Can you at least include the patch to test_io.py from Amaury's patch? And why not fix the Python version of the io module (I'm not sure of the new name: _pyio?), since we have a working patch?

[issue4862] utf-16 BOM is not skipped after seek(0)

2009-03-04 Thread Benjamin Peterson
Benjamin Peterson added the comment: This has been fixed by the io-c branch merge.
--
nosy: +benjamin.peterson
resolution: -> fixed
status: open -> closed

[issue4862] utf-16 BOM is not skipped after seek(0)

2009-01-19 Thread STINNER Victor
STINNER Victor added the comment: I opened a different issue (#5006) for the duplicate BOM in append mode.

[issue4862] utf-16 BOM is not skipped after seek(0)

2009-01-19 Thread Antoine Pitrou
Antoine Pitrou added the comment: I support Amaury's suggestion (actually I implemented it in the io-c branch). Resetting the decoder when seeking to the beginning of the stream is a reasonable way to deal with those incremental decoders for which the start state is something other than (b"", 0).
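
The point about start states can be illustrated directly with the incremental decoder from the standard `codecs` module. The following is only a sketch added for illustration, not code from the thread: the variable names are assumptions, and it exercises the decoder on its own rather than through TextIOWrapper. It compares restoring the state (b"", 0) with calling reset() before decoding a BOM-prefixed buffer again.

```python
import codecs

# BOM followed by the payload, both in the platform's native byte order.
data = "0123456789".encode("utf-16")

decoder = codecs.getincrementaldecoder("utf-16")()

# State of a fresh decoder: the byte order has not been determined yet,
# so this is not the same as (b"", 0).
initial_state = decoder.getstate()

first_read = decoder.decode(data)
state_after_read = decoder.getstate()

# Restoring (b"", 0) tells the decoder the byte order is already known,
# so the BOM is decoded as an ordinary U+FEFF character.
decoder.setstate((b"", 0))
read_after_setstate = decoder.decode(data)

# reset() returns the decoder to its real start state, so the BOM is
# detected and skipped again.
decoder.reset()
read_after_reset = decoder.decode(data)

print("initial state      ", initial_state)
print("state after read   ", state_after_read)
print("after setstate     ", ascii(read_after_setstate))
print("after reset        ", ascii(read_after_reset))
```

Under these assumptions, only the reset() path reproduces the first read, which is the behaviour the suggestion above relies on for seek(0).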

[issue4862] utf-16 BOM is not skipped after seek(0)

2009-01-08 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment:
On 2009-01-07 01:21, Amaury Forgeot d'Arc wrote:
> First write a utf-16 file with its signature:
>
> f1 = open('utf16.txt', 'w', encoding='utf-16')
> f1.write('0123456789')
> f1.close()
>
> Then read it twice:
>
> f2 = open('utf16.txt', 'r', e

[issue4862] utf-16 BOM is not skipped after seek(0)

2009-01-07 Thread Antoine Pitrou
Antoine Pitrou added the comment: Well, there are other problems with utf-16, e.g. when opening an existing file for appending, the BOM is written again:
>>> f = open('utf16.txt', 'w', encoding='utf-16')
>>> f.write('abc')
3
>>> f.close()
>>> f = open('utf16.txt', 'a', encoding='utf-16')
>>> f.
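
The append-mode behaviour described in this comment can be checked with a short script. This is a sketch added for illustration, not code from the thread: the helper name `append_twice` and the use of a temporary file are assumptions. It writes a file, appends to it, and counts how many BOMs end up in the raw bytes. An affected interpreter would report 2; an interpreter in which the duplicate-BOM problem has been fixed should report 1.

```python
import codecs
import os
import tempfile


def append_twice(first, second, encoding="utf-16"):
    """Write `first`, append `second`, and return the raw file bytes."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w", encoding=encoding) as f:
            f.write(first)
        with open(path, "a", encoding=encoding) as f:
            f.write(second)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


raw = append_twice("abc", "def")
# codecs.BOM_UTF16 is the BOM in the platform's native byte order,
# which is what the "utf-16" encoder writes.
bom_count = raw.count(codecs.BOM_UTF16)
print("BOM count:", bom_count)
```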

[issue4862] utf-16 BOM is not skipped after seek(0)

2009-01-07 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment:
> The problem is maybe that TextIOWrapper._pack_cookie()
> can create a cookie=0
But only when position==0. And in this case, at the beginning of the stream, it makes sense to reset everything to its initial value: zero for the various counts, and call d

[issue4862] utf-16 BOM is not skipped after seek(0)

2009-01-07 Thread STINNER Victor
STINNER Victor added the comment:
> This is because the zero in seek(0) is a "cookie"
> which contains both the position and the decoder state.
> Unfortunately, state=0 means 'endianness has been determined:
> native order'.
The problem is maybe that TextIOWrapper._pack_cookie() can create a

[issue4862] utf-16 BOM is not skipped after seek(0)

2009-01-07 Thread STINNER Victor
Changes by STINNER Victor: -- nosy: +haypo

[issue4862] utf-16 BOM is not skipped after seek(0)

2009-01-06 Thread Amaury Forgeot d'Arc
New submission from Amaury Forgeot d'Arc:
First write a utf-16 file with its signature:
>>> f1 = open('utf16.txt', 'w', encoding='utf-16')
>>> f1.write('0123456789')
>>> f1.close()
Then read it twice:
>>> f2 = open('utf16.txt', 'r', encoding='utf-16')
>>> print('read1', ascii(f2.read()))
read
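
The reproduction steps in this report can be collected into one self-contained script. This is a sketch added for illustration, not the submitter's exact session: the helper name `read_twice` and the temporary file are assumptions. On an affected interpreter the second read would start with a stray '\ufeff'; on an interpreter that includes the fix, both reads should be identical.

```python
import os
import tempfile


def read_twice(text, encoding="utf-16"):
    """Write `text`, then read it, seek(0), and read it again."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w", encoding=encoding) as f:
            f.write(text)
        with open(path, "r", encoding=encoding) as f:
            first = f.read()
            f.seek(0)  # back to the start of the stream, before the BOM
            second = f.read()
        return first, second
    finally:
        os.remove(path)


read1, read2 = read_twice("0123456789")
print("read1", ascii(read1))
print("read2", ascii(read2))
```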