On Wed, Nov 30, 2011 at 1:57 AM, Jim Meyering <j...@meyering.net> wrote:
> Michael Gray wrote:
>> I believe I've found a bug regarding the decompression of single-entry
>> .zip files.
>>
>> As per Section V.C of the .ZIP File Format Specification
>> (http://www.pkware.com/documents/casestudies/APPNOTE.TXT), data
>> descriptors (called the extended local header in the gzip source) *may
>> or may not* be preceded by a signature. Gzip always assumes this
>> signature is present; if it is not, it reads the CRC and length values
>> 4 bytes further into the file than it should, and the CRC and length
>> checks fail even though the file is not corrupt.
>>
>> I've included a patch that works around the problem. First, it assumes
>> that the signature *is not present*, as it's possible that the
>> signature value is also a valid CRC, and no non-corrupt file should be
>> rejected if its CRC just happens to match the signature value. If the
>> CRC or length check fails assuming the signature is not present, the
>> signature is then checked for. If present, 4 more bytes of input are
>> read, and the previously read values are shifted appropriately. The
>> CRC and length checks then proceed as normal.
>>
>> Below is the relevant text from the .ZIP spec:
>> "
>>
>> Although not originally assigned a signature, the value
>> 0x08074b50 has commonly been adopted as a signature value
>> for the data descriptor record. Implementers should be
>> aware that ZIP files may be encountered with or without this
>> signature marking data descriptors and should account for
>> either case when reading ZIP files to ensure compatibility.
>> When writing ZIP files, it is recommended to include the
>> signature value marking the data descriptor record. When
>> the signature is used, the fields currently defined for
>> the data descriptor record will immediately follow the
>> signature.
>>
>> "
>>
>> -- Michael Gray
>>
>> mike...@gmail.com
>
> Thank you for the analysis and patch.
> However, I haven't looked at it yet, in case I have to
> rewrite it based solely on your description.
>
> Can you point to tools that produce ZIP files without that signature?
>
> Assuming that we find a few that are still in non-trivial use,
> we'll need a copyright assignment, since your patch is large enough to
> require that. Can you sign one? If so, here are some details:
> [that link is for the coreutils package, but it's the same policy for gzip]
>
> http://git.sv.gnu.org/cgit/coreutils.git/tree/HACKING#n444
>
> Note: if you're in the US, you should be able to fax the signed form
> rather than using actual stamp and envelope.
>
> Jim

Apparently my emails to the copyright clerk ended up in the wrong queue
and were delayed for two months. I have now received the copyright
assignment form, signed it, and submitted it.

-- Michael
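
For anyone following the thread without the patch in hand, the fallback
described in the quoted report can be sketched in Python roughly as below.
This is only an illustration of the idea, not the submitted patch (which
targets gzip's C source and is not reproduced in this thread). The function
name read_data_descriptor, its parameters, and its return shape are all
hypothetical; the field layout (CRC-32, compressed size, uncompressed size,
each 4 bytes little-endian, optionally preceded by 0x08074b50) is taken from
the ZIP spec text quoted above.

```python
import struct
import zlib

# Optional data descriptor signature, per the ZIP spec excerpt quoted above.
DD_SIGNATURE = 0x08074B50


def read_data_descriptor(trailer: bytes, actual_crc: int, actual_len: int):
    """Hypothetical helper: validate the data descriptor that follows the
    compressed data of a single-entry .zip file.

    Returns (crc, compressed_size, uncompressed_size, has_signature),
    or raises ValueError if neither layout matches the computed values.
    """
    if len(trailer) < 12:
        raise ValueError("truncated data descriptor")

    # First attempt: assume NO signature, so that a file whose CRC happens
    # to equal the signature value is still accepted.
    crc, csize, usize = struct.unpack_from("<III", trailer, 0)
    if crc == actual_crc and usize == actual_len:
        return crc, csize, usize, False

    # Second attempt: the first word was the signature; the real fields
    # start 4 bytes later.
    if crc == DD_SIGNATURE and len(trailer) >= 16:
        crc, csize, usize = struct.unpack_from("<III", trailer, 4)
        if crc == actual_crc and usize == actual_len:
            return crc, csize, usize, True

    raise ValueError("CRC or length mismatch")


# Usage: build both descriptor layouts for the same payload and check them.
payload = b"hello zip"
payload_crc = zlib.crc32(payload) & 0xFFFFFFFF
unsigned_dd = struct.pack("<III", payload_crc, 20, len(payload))  # 20 is an arbitrary compressed size
signed_dd = struct.pack("<I", DD_SIGNATURE) + unsigned_dd
result_unsigned = read_data_descriptor(unsigned_dd, payload_crc, len(payload))
result_signed = read_data_descriptor(signed_dd, payload_crc, len(payload))
```

Trying the unsigned layout first mirrors the ordering argued for in the
report: a signature check done first would misread a legitimate CRC of
0x08074b50 as a signature.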