New submission from Peter <p.j.a.c...@googlemail.com>:

Consider the following example, where I have a gzipped text file:
$ python3
Python 3.2 (r32:88445, Feb 28 2011, 17:04:33)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import gzip
>>> with gzip.open("ex1.sam.gz") as handle:
...     line = handle.readline()
...
>>> line
b'EAS56_57:6:190:289:82\t69\tchr1\t100\t0\t*\t=\t100\t0\tCTCAAGGTTGTTGCAAGGGGGTCTATGTGAACAAA\t<<<7<<<;<<<<<<<<8;;<7;4<;<;;;;;94<;\tMF:i:192\n'

Notice the file was opened in binary mode ("rb" is the default for
gzip.open, which is surprising given "t" is the default for open on
Python 3), and a byte string is returned.

Now try explicitly using non-binary reading "r", and again you get
bytes rather than a (unicode) string as I would expect:

>>> with gzip.open("ex1.sam.gz", "r") as handle:
...     line = handle.readline()
...
>>> line
b'EAS56_57:6:190:289:82\t69\tchr1\t100\t0\t*\t=\t100\t0\tCTCAAGGTTGTTGCAAGGGGGTCTATGTGAACAAA\t<<<7<<<;<<<<<<<<8;;<7;4<;<;;;;;94<;\tMF:i:192\n'

Now try and use "t" or "rt" to be even more explicit that text mode is
desired:

>>> with gzip.open("ex1.sam.gz", "t") as handle:
...     line = handle.readline()
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/pjcock/lib/python3.2/gzip.py", line 46, in open
    return GzipFile(filename, mode, compresslevel)
  File "/Users/pjcock/lib/python3.2/gzip.py", line 157, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
ValueError: can't have text and binary mode at once

>>> with gzip.open("ex1.sam.gz", "rt") as handle:
...     line = handle.readline()
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/pjcock/lib/python3.2/gzip.py", line 46, in open
    return GzipFile(filename, mode, compresslevel)
  File "/Users/pjcock/lib/python3.2/gzip.py", line 157, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
ValueError: can't have text and binary mode at once

See also Issue #5148 which is perhaps somewhat related.
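For comparison, here is a minimal sketch of one way to get str lines out of a gzipped file without relying on a text mode in gzip.open: open the file in binary mode and wrap the handle in io.TextIOWrapper. The helper name read_first_line_as_text, the file name example.txt.gz, the sample record and the ASCII encoding are all assumptions made for illustration, not part of the report above. The sketch also assumes a Python 3 release in which the gzip file object supports everything io.TextIOWrapper needs; it has not been checked against 3.2 specifically.

```python
import gzip
import io
import os
import tempfile


def read_first_line_as_text(path, encoding="ascii"):
    """Return the first line of a gzipped file as str (hypothetical helper)."""
    # Open in binary mode, then let TextIOWrapper do the decoding.
    # Closing the wrapper also closes the underlying gzip handle.
    with io.TextIOWrapper(gzip.open(path, "rb"), encoding=encoding) as text_handle:
        return text_handle.readline()


with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical sample file, written here so the sketch is self-contained.
    path = os.path.join(tmp_dir, "example.txt.gz")
    with gzip.open(path, "wb") as out_handle:
        out_handle.write(b"read1\t69\tchr1\t100\n")

    # Default mode: the line comes back as bytes, as in the report.
    with gzip.open(path) as handle:
        raw_line = handle.readline()

    # Wrapped handle: the same line comes back as str.
    text_line = read_first_line_as_text(path)

print(type(raw_line).__name__, type(text_line).__name__)
```

The encoding has to be chosen by the caller, since the gzip container itself does not record one.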
----------
components: None
messages: 153067
nosy: maubp
priority: normal
severity: normal
status: open
title: gzip always returns byte strings, no text mode
versions: Python 3.2

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue13989>
_______________________________________