New submission from Thomas Guettler:

The stream reader returned by codecs.open() breaks lines on characters that the documentation of readline() does not mention, for example '\x85':

http://docs.python.org/2/library/codecs.html?highlight=codecs%20readline#codecs.StreamReader.readline

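import tempfile
# Write a test file that contains '\n' and the byte 0x85.
temp=tempfile.mktemp()
fd=open(temp, 'wb')
fd.write('abc\ndef\x85ghi')
fd.close()

import codecs
# Read the file back line by line through the codecs stream reader.
fd=codecs.open(temp, 'rb', 'latin1')
while True:
    line=fd.readline()
    if not line:
        break
    print repr(line)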

Result:
u'abc\n'
u'def\x85'
u'ghi'

Related: 
http://stackoverflow.com/questions/16227114/utf-8-files-read-in-python-will-line-break-at-character-x85
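
For comparison, here is a minimal sketch (Python 3 syntax, in-memory streams instead of a temp file) that puts the codecs stream reader next to the io text layer. It assumes the split comes from the reader applying unicode line splitting to the decoded text, which treats U+0085 (NEL) as a line boundary, while the io layer ends lines only at '\n', '\r' and '\r\n'. The variable names are made up for illustration.

```python
import codecs
import io

data = b'abc\ndef\x85ghi'

# codecs stream reader: the bytes are decoded first and the resulting text
# is split with unicode line-splitting rules, so U+0085 ends a line.
reader = codecs.getreader('latin-1')(io.BytesIO(data))
codecs_lines = list(iter(reader.readline, ''))  # readline() returns '' at EOF

# io text layer: only '\n', '\r' and '\r\n' are treated as line endings.
text = io.TextIOWrapper(io.BytesIO(data), encoding='latin-1')
io_lines = list(text)

print(codecs_lines)
print(io_lines)
```

If only '\n'-style line endings are wanted, reading through io.open() (available since Python 2.6) may be a workable alternative to codecs.open().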

----------
assignee: docs@python
components: Documentation
messages: 192112
nosy: docs@python, guettli
priority: normal
severity: normal
status: open
title: codecs: StreamReader readline() breaks on undocumented characters
versions: Python 2.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18337>
_______________________________________