New submission from Evgeny Kapun:

GzipFile's underlying stream can be a raw stream (such as FileIO), and such streams can return short reads and writes at any time (e.g. due to signals). The correct behavior in case of a short read or write is to retry the call to read or write the remaining data.
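
The retry behavior described above can be sketched roughly as follows. This is only an illustration, not code from the gzip module: the helper names read_exactly and write_all are hypothetical, and the sketch assumes a blocking raw stream whose read() returns b'' at end of file and whose write() returns the number of bytes accepted.

```python
def read_exactly(raw, n):
    """Read up to n bytes from raw, retrying after short reads.

    Returns fewer than n bytes only if end of file is reached first.
    (Hypothetical helper, not part of gzip or io.)
    """
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = raw.read(remaining)
        if chunk is None:
            # Non-blocking stream with no data ready; this sketch does
            # not try to wait and simply reports the condition.
            raise BlockingIOError("no data available")
        if not chunk:
            break  # end of file
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def write_all(raw, data):
    """Write all of data to raw, retrying after short writes.

    (Hypothetical helper, not part of gzip or io.)
    """
    view = memoryview(data)
    while len(view) > 0:
        written = raw.write(view)
        if written is None:
            # Non-blocking stream that accepted nothing.
            raise BlockingIOError("stream not ready for writing")
        # Drop the bytes that were accepted and retry with the rest.
        view = view[written:]
```

With helpers like these, a stream that hands back one byte per read() would still yield the full two-byte gzip magic number from read_exactly(raw, 2), and a stream that accepts one byte per write() would still receive every byte passed to write_all().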
GzipFile doesn't do this.

This program demonstrates the problem with reading:

    import io, gzip

    class MyFileIO(io.FileIO):
        def read(self, n):
            # Emulate short read
            return super().read(1)

    raw = MyFileIO('test.gz', 'rb')
    gzf = gzip.open(raw, 'rb')
    gzf.read()

Output:

    $ gzip -c /dev/null > test.gz
    $ python3 test.py
    Traceback (most recent call last):
      File "test.py", line 10, in <module>
        gzf.read()
      File "/usr/lib/python3.5/gzip.py", line 274, in read
        return self._buffer.read(size)
      File "/usr/lib/python3.5/gzip.py", line 461, in read
        if not self._read_gzip_header():
      File "/usr/lib/python3.5/gzip.py", line 409, in _read_gzip_header
        raise OSError('Not a gzipped file (%r)' % magic)
    OSError: Not a gzipped file (b'\x1f')

And this shows the problem with writing:

    import io, gzip

    class MyIO(io.RawIOBase):
        def write(self, data):
            print(data)
            # Emulate short write
            return 1

    raw = MyIO()
    gzf = gzip.open(raw, 'wb')
    gzf.close()

Output:

    $ python3 test.py
    b'\x1f\x8b'
    b'\x08'
    b'\x00'
    b'\xb9\xea\xffW'
    b'\x02'
    b'\xff'
    b'\x03\x00'
    b'\x00\x00\x00\x00'
    b'\x00\x00\x00\x00'

It can be seen that there is no attempt to write all the data. Indeed, the return value of the write() method is completely ignored.

I think that either the gzip module should be changed to handle short reads and writes properly, or its documentation should reflect the fact that it cannot be used with raw streams.

----------
components: Library (Lib)
messages: 278606
nosy: abacabadabacaba
priority: normal
severity: normal
status: open
title: GzipFile doesn't properly handle short reads and writes on the underlying stream
type: behavior
versions: Python 3.5

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue28436>
_______________________________________