Karthikeyan Singaravelan <tir.kar...@gmail.com> added the comment:
When iterating over codecs.getreader('utf-8')(open('test.txt', 'rb')), the stream reader calls str.splitlines on the decoded data. str.splitlines treats '\x0b' as a line boundary, because the set of boundaries it splits on is a superset of universal newlines, as specified in [0]. So the reader splits on '\x0b' as a valid newline for str, and that is working correctly.

./python.exe
Python 3.8.0a0 (heads/master:6f85b826b5, Oct 4 2018, 22:44:36)
[Clang 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 'first line\x0b\x0bblah blah\nsecond line\n'  # as returned by codecs.getreader()
>>> a.splitlines(keepends=True)
['first line\x0b', '\x0b', 'blah blah\n', 'second line\n']

# For bytes, bytes.splitlines works only on universal newlines and thus doesn't split on '\x0b' [1]

>>> b = b'first line\x0b\x0bblah blah\nsecond line\n'
>>> b.splitlines(keepends=True)
[b'first line\x0b\x0bblah blah\n', b'second line\n']

io.TextIOWrapper, however, only accepts None, '', '\n', '\r\n' and '\r' as the newline argument in text mode. Binary files are different again: as noted in the readline documentation [2], only b'\n' is accepted as the line terminator.

> The line terminator is always b'\n' for binary files; for text
> files, the newlines argument to open can be used to select the line
> terminator(s) recognized.

Thus reading 'first line\x0b\x0bblah blah\nsecond line\n' through io.TextIOWrapper gives ['first line\x0b\x0bblah blah\n', 'second line\n']. Trying to use '\x0b' as the newline results in an "illegal newline" error in TextIOWrapper.

Hope I am correct on the above analysis.
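To compare the behaviours discussed here side by side, something like the following could be run. It is only a sketch: it uses io.BytesIO as a stand-in for test.txt, and the helper names (lines_via_stream_reader, lines_via_text_wrapper, lines_via_binary_file, newline_is_accepted) are made up for illustration and are not part of any library.

```python
import codecs
import io

# Same payload as in the session above; BytesIO stands in for test.txt (assumption).
DATA = b'first line\x0b\x0bblah blah\nsecond line\n'


def lines_via_stream_reader(data):
    """Iterate a codecs StreamReader, which splits decoded text with str.splitlines."""
    reader = codecs.getreader('utf-8')(io.BytesIO(data))
    return list(reader)


def lines_via_text_wrapper(data):
    """Iterate an io.TextIOWrapper using its default newline handling (newline=None)."""
    wrapper = io.TextIOWrapper(io.BytesIO(data), encoding='utf-8')
    return list(wrapper)


def lines_via_binary_file(data):
    """Iterate a binary stream, where only b'\\n' terminates a line."""
    return list(io.BytesIO(data))


def newline_is_accepted(newline):
    """Return True if io.TextIOWrapper accepts this value for its newline argument."""
    try:
        io.TextIOWrapper(io.BytesIO(b''), encoding='utf-8', newline=newline)
    except ValueError:
        return False
    return True


if __name__ == '__main__':
    print('StreamReader :', lines_via_stream_reader(DATA))
    print('TextIOWrapper:', lines_via_text_wrapper(DATA))
    print('binary file  :', lines_via_binary_file(DATA))
    for candidate in (None, '', '\n', '\r', '\r\n', '\x0b'):
        print(repr(candidate), 'accepted as newline:', newline_is_accepted(candidate))
```

If the analysis holds, only the StreamReader result should contain lines ending in '\x0b', and '\x0b' should be the only candidate rejected as a newline value.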
[0] https://docs.python.org/3.8/library/stdtypes.html#str.splitlines
[1] https://docs.python.org/3.8/library/stdtypes.html#bytes.splitlines
[2] https://docs.python.org/3/library/io.html#io.TextIOBase.readline

----------
nosy: +xtreak

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34801>
_______________________________________