Nick Czeczulin added the comment:

The spec allows for multi-member files. Some libraries and utilities (e.g., 
DotNetZip, 7-Zip) seem to solve this problem (incorrectly?) by simply ignoring 
everything past the first member, even when it is valid.
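
For reference, a minimal sketch of what "multi-member" means in practice, 
assuming Python 3 and only the standard gzip module. The variable names are 
made up for illustration:

```python
import gzip

# Two independently compressed members, concatenated into one stream.
first = gzip.compress(b"hello ")
second = gzip.compress(b"world")
stream = first + second

# The gzip module reads through every member and joins the payloads,
# rather than stopping after the first one.
result = gzip.decompress(stream)
print(result)
```

A reader that stops after the first member would return only the first 
payload here.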

On 2.7 and 3.4, data that was decompressed but not yet read when the exception 
was raised is still available. Modifying Martin's example slightly:

>>> from io import BytesIO
>>> from gzip import GzipFile
>>> f = BytesIO()
>>> with GzipFile(fileobj=f, mode="wb") as z:
...     z.write(b"data")
...
4
>>> f.write(b"garbage")
7
>>> f.seek(0)
0
>>> with GzipFile(fileobj=f, mode="rb") as z:
...     try:
...         z.read(1)
...         z.read()
...     except OSError as e:
...         z.extrabuf[z.offset - z.extrastart:]
...         e
...
b'd'
b'ata'
OSError('Not a gzipped file',)

My issue is that catching and handling this specific exception is a little more 
involved, because there are 3(?) different OSErrors (IOError on 2.7) that could 
potentially be raised during the read. The main concern is 
OSError('CRC check failed 0x447ba3f9 != 0x225cb2a3',), which would be a bad one 
to mistake for it.

Maybe add a specific exception type to catch for an invalid header, plus a 
better way to read the remaining buffer when handling it?
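
In the meantime, one possible workaround is to bypass GzipFile and drive zlib 
directly, which keeps trailing bytes and checksum failures apart. This is only 
a rough sketch, assuming Python 3.3+ (for the `eof` attribute) and that the 
whole stream fits in memory. `split_gzip_stream` and `GZIP_MAGIC` are 
hypothetical names, not part of the gzip module:

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip member


def split_gzip_stream(data):
    """Return (payload, trailing) for a possibly multi-member gzip stream.

    Hypothetical helper: members are decompressed one at a time until the
    remaining bytes no longer start with the gzip magic number. Those
    remaining bytes are handed back untouched instead of raising.
    """
    chunks = []
    while data[:2] == GZIP_MAGIC:
        # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header/trailer.
        member = zlib.decompressobj(16 + zlib.MAX_WBITS)
        chunks.append(member.decompress(data))
        if not member.eof:
            raise EOFError("truncated gzip member")
        # Whatever follows the member's trailer is left in unused_data.
        data = member.unused_data
    return b"".join(chunks), data


payload, trailing = split_gzip_stream(gzip.compress(b"data") + b"garbage")
print(payload, trailing)
```

With this approach a CRC mismatch surfaces as zlib.error rather than OSError, 
so it cannot be mistaken for trailing garbage. One caveat: garbage that 
happens to begin with the magic bytes would still raise zlib.error.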

----------
nosy: +nczeczulin

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue24301>
_______________________________________