Michael Gray wrote:
> I believe I've found a bug regarding the decompression of single-entry
> .zip files.
>
> As per Section V.C of the .ZIP File Format Specification
> (http://www.pkware.com/documents/casestudies/APPNOTE.TXT), data
> descriptors (called the extended local header in the gzip source) *may
> or may not* be preceded by a signature. Gzip always assumes this
> signature is present; if it is not, it reads the CRC and length values
> 4 bytes further into the file than it should, and the CRC and length
> checks fail even though the file is not corrupt.
>
> I've included a patch that works around the problem. First, it assumes
> that the signature *is not present*, as it's possible that the
> signature value is also a valid CRC, and no non-corrupt file should be
> rejected if its CRC just happens to match the signature value. If the
> CRC or length check fails assuming the signature is not present, the
> signature is then checked for. If present, 4 more bytes of input are
> read, and the previously read values are shifted appropriately. The
> CRC and length checks then proceed as normal.
>
> Below is the relevant text from the .ZIP spec:
> "
>
> Although not originally assigned a signature, the value
> 0x08074b50 has commonly been adopted as a signature value
> for the data descriptor record. Implementers should be
> aware that ZIP files may be encountered with or without this
> signature marking data descriptors and should account for
> either case when reading ZIP files to ensure compatibility.
> When writing ZIP files, it is recommended to include the
> signature value marking the data descriptor record. When
> the signature is used, the fields currently defined for
> the data descriptor record will immediately follow the
> signature.
>
> "
>
> -- Michael Gray
>
> mike...@gmail.com

Thank you for the analysis and patch.
However, I haven't looked at it yet, in case I have to rewrite it
based solely on your description.

Can you point to tools that produce ZIP files without that signature?

Assuming that we find a few that are still in non-trivial use,
we'll need a copyright assignment, since your patch is large enough
to require that. Can you sign one? If so, here are some details:
[that link is for the coreutils package, but it's the same policy for gzip]

  http://git.sv.gnu.org/cgit/coreutils.git/tree/HACKING#n444

Note: if you're in the US, you should be able to fax the signed form
rather than using an actual stamp and envelope.

Jim
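
For reference, the check order described in the quoted report (try the descriptor without a signature first, and only then look for 0x08074b50) might look roughly like the sketch below. It is written independently in Python from the description alone and is not the submitted patch or gzip's code. The function names `descriptor_ok` and `_fields_match` are hypothetical, and the field layout (CRC-32, compressed size, uncompressed size, each 4 bytes little-endian) is assumed from the non-ZIP64 data descriptor in APPNOTE.TXT.

```python
import struct

# Optional data descriptor signature, as given in the quoted APPNOTE text.
DATA_DESCRIPTOR_SIG = 0x08074B50


def _fields_match(trailer: bytes, offset: int, crc: int, size: int) -> bool:
    """Compare the descriptor fields starting at `offset` with computed values.

    Assumed layout: CRC-32, compressed size, uncompressed size (3 x uint32 LE).
    """
    if len(trailer) < offset + 12:
        return False
    d_crc, _d_csize, d_usize = struct.unpack_from("<III", trailer, offset)
    return d_crc == crc and d_usize == size


def descriptor_ok(trailer: bytes, crc: int, size: int) -> bool:
    """Accept a data descriptor with or without the leading signature.

    `trailer` holds the bytes that follow the compressed data; `crc` and
    `size` are the values computed while decompressing.
    """
    # 1. Assume no signature, so a CRC equal to the signature value still passes.
    if _fields_match(trailer, 0, crc, size):
        return True
    # 2. Otherwise, if the first word is the signature, retry 4 bytes later.
    if len(trailer) >= 4:
        (first_word,) = struct.unpack_from("<I", trailer, 0)
        if first_word == DATA_DESCRIPTOR_SIG:
            return _fields_match(trailer, 4, crc, size)
    return False
```

Trying the unsigned layout first is what keeps a valid file from being rejected when its CRC happens to equal the signature value.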