New submission from Peter:

Regression from Python 3.3.0 to 3.3.1, tested under Mac OS X 10.8 and 64-bit CentOS Linux.
The same regression is also present going from Python 2.7.3 to 2.7.4; does that need a separate issue filed?

Consider this VALID GZIP file.

Human link:
https://github.com/biopython/biopython/blob/master/Tests/GenBank/cor6_6.gb.bgz

Binary link, only a small file:
https://raw.github.com/biopython/biopython/master/Tests/GenBank/cor6_6.gb.bgz

This is compressed using a GZIP variant called BGZF, which uses multiple blocks and records additional tags in the header. For background see:
http://blastedbio.blogspot.com/2011/11/bgzf-blocked-bigger-better-gzip.html

$ curl -O https://raw.github.com/biopython/biopython/master/Tests/GenBank/cor6_6.gb.bgz
$ cat cor6_6.gb.bgz | gunzip | wc
     320    1183   14967

Now for the bug. Expected behaviour:

$ python3.2
Python 3.2 (r32:88445, Feb 28 2011, 17:04:33)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import gzip
>>> handle = gzip.open("cor6_6.gb.bgz", "rb")
>>> data = handle.read()
>>> handle.close()
>>> len(data)
14967
>>> quit()

Broken behaviour:

$ python3.3
Python 3.3.1 (default, Apr 8 2013, 17:54:08)
[GCC 4.2.1 Compatible Apple Clang 4.0 ((tags/Apple/clang-421.0.57))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import gzip
>>> handle = gzip.open("cor6_6.gb.bgz", "rb")
>>> data = handle.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/pjcock/lib/python3.3/gzip.py", line 359, in read
    while self._read(readsize):
  File "/Users/pjcock/lib/python3.3/gzip.py", line 432, in _read
    if not self._read_gzip_header():
  File "/Users/pjcock/lib/python3.3/gzip.py", line 305, in _read_gzip_header
    self._read_exact(struct.unpack("<H", self._read_exact(2)))
  File "/Users/pjcock/lib/python3.3/gzip.py", line 282, in _read_exact
    data = self.fileobj.read(n)
  File "/Users/pjcock/lib/python3.3/gzip.py", line 81, in read
    return self.file.read(size)
TypeError: integer argument expected, got 'tuple'

The bug is very simple, an error in line 305 of gzip.py (the line named in the traceback above):

303        if flag & FEXTRA:
304            # Read & discard the extra field, if present
305            self._read_exact(struct.unpack("<H", self._read_exact(2)))

The struct.unpack function returns a single-element tuple, which is then passed where an integer length is expected. Thus a fix is:

303        if flag & FEXTRA:
304            # Read & discard the extra field, if present
305            extra_len, = struct.unpack("<H", self._read_exact(2))
306            self._read_exact(extra_len)

This bug was identified via failing Biopython unit tests under Python 2.7.4 and 3.3.1, which all pass with this minor fix applied.

----------
components: Library (Lib)
messages: 186320
nosy: maubp
priority: normal
severity: normal
status: open
title: Extra gzip headers breaks _read_gzip_header
versions: Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue17666>
_______________________________________
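
The failure mode in this report can probably be reproduced without downloading the test file. The following is a minimal sketch, not part of the original submission: it builds a small gzip stream by hand whose header sets FEXTRA, then reads it back with the gzip module. The helper name gzip_member_with_extra is hypothetical, and the extra-field bytes are only a BGZF-shaped placeholder (the BSIZE value is not a real block size), which is enough here because the reader just skips the field.

```python
import gzip
import io
import struct
import zlib

FEXTRA = 0x04  # header flag bit for "extra field present" (RFC 1952)


def gzip_member_with_extra(payload, extra):
    """Hypothetical helper: build one gzip member carrying an extra field."""
    # Fixed 10-byte header: magic, CM=8 (deflate), FLG, MTIME, XFL, OS=255.
    header = struct.pack("<2sBBIBB", b"\x1f\x8b", 8, FEXTRA, 0, 0, 255)
    # XLEN (little-endian unsigned short) followed by the extra bytes.
    header += struct.pack("<H", len(extra)) + extra
    # Raw deflate stream (negative wbits means no zlib wrapper).
    compressor = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
    body = compressor.compress(payload) + compressor.flush()
    # Trailer: CRC32 and length of the uncompressed data.
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    trailer = struct.pack("<II", crc, len(payload) & 0xFFFFFFFF)
    return header + body + trailer


# struct.unpack returns a tuple even for a single field, so it must be
# unpacked before being used as a read length.
raw_len = struct.pack("<H", 6)
assert struct.unpack("<H", raw_len) == (6,)
(extra_len,) = struct.unpack("<H", raw_len)

# Placeholder subfield: SI1='B', SI2='C', SLEN=2, then a dummy 2-byte value.
extra = b"BC\x02\x00\x00\x00"
# Two concatenated members, loosely mimicking a multi-block file.
stream = gzip_member_with_extra(b"hello", extra) * 2

with gzip.GzipFile(fileobj=io.BytesIO(stream)) as handle:
    result = handle.read()
```

On an interpreter with the fix applied, result should hold the payload of both members. On the affected releases (3.3.1 and 2.7.4), the read call would be expected to raise the TypeError shown in the traceback above.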