If I explain the scenario in more detail then it might become clearer.
I am seeing issues with certain zip files and file formats based on zip (such as docx and xps). We read these files from a stream, so we use ZipArchiveInputStream. We loop, fetching each entry with getNextZipEntry, until we get a null and stop. All looks good. However, we have only extracted 1 or 2 entries out of a known 20 or 30, whereas the file-based extractor extracts all of them.

I cannot provide an example file, as the examples I have are all customer owned. However, every xps file I have seen suffers from the issue:

http://www.microsoft.com/whdc/xps/xpssampdoc.mspx

I have investigated the issue, and it is caused by entries that rely on the central directory for their sizes, i.e. those written with a data descriptor. In the zip stream reader the size, csize and crc fields of such an entry are all zero, and there is no central directory available to the reader, so it performs no extraction. This means the next call to getNextZipEntry is incorrectly positioned and fails when checking the entry signature (LFH_SIG). It returns a null, and to the calling code it appears that we have succeeded.

So my two change requests are simply to let me validate entries and detect these types of stream so I can do something appropriate. Compress 1.1 adds support for identifying encrypted entries, which I need, hence the request to also identify entries using the data descriptor.

The second request is to not return a null when this type of error occurs, but to indicate the error somehow. There might be issues here (I am no zip expert) but I would be worried about false errors being reported.

Simon

On 11/03/2010 13:11, "Stefan Bodewig" <bode...@apache.org> wrote:

> On 2010-03-10, Simon Tyler <sty...@mimecast.net> wrote:
>
>> Do we have a date yet for the compress 1.1 release?
>
> What Christian said.
>
>> Also, is there time to add a couple of minor feature enhancements? I could
>> do with access to the following:
>
>> 1. A public method to check if a ZipArchiveInputStream has a data
>> descriptor (e.g. return hasDataDescriptor).
>
> This is a property of the individual entry, not the stream as a whole,
> isn't it? Why would you want to know that (just curious)? We could
> probably make the general purpose flags available and you could look at
> bit 3.
>
>> 2. Better handling when ZipArchiveInputStream is used to read such streams.
>> Currently it silently fails when this happens when if hits an invalid
>> LFH_SIG by returning null.
>
> I'm not sure what you mean, could you describe what happens under what
> circumstances in more detail? I see that the data descriptor isn't read
> anywhere and I see that the stream may fail if the data descriptor uses
> the "unofficial signature" mentioned in appnote, is this what you mean?
>
> Stefan
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
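
To make the "look at bit 3" suggestion concrete, here is a minimal sketch of the check I have in mind. It deliberately uses only java.io and java.util.zip rather than commons-compress, and the class LocalHeaderFlags with its methods is a hypothetical name for illustration, not an existing or proposed API. It reads the first local file header from a stream, throws instead of silently giving up when the signature is not LFH_SIG, and reports whether the general purpose flags have bit 3 (data descriptor) set. The sample archives are built with java.util.zip.ZipOutputStream, on the assumption that it writes a data descriptor for a DEFLATED entry whose size and crc were not set in advance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, for illustration only.
final class LocalHeaderFlags {
    // Local file header signature "PK\003\004", read as a little-endian value.
    static final long LFH_SIG = 0x04034b50L;
    // Bit 3 of the general purpose flags: sizes and crc follow the entry data.
    static final int DATA_DESCRIPTOR_BIT = 1 << 3;

    // Reads the start of one local file header and returns its general purpose flags.
    // Throws IOException (rather than returning a sentinel) on a bad signature.
    public static int readGeneralPurposeFlags(InputStream in) throws IOException {
        byte[] header = new byte[8]; // signature (4), version needed (2), flags (2)
        new DataInputStream(in).readFully(header); // EOFException if truncated
        long signature = littleEndian(header, 0, 4);
        if (signature != LFH_SIG) {
            throw new IOException("Not a local file header: 0x" + Long.toHexString(signature));
        }
        return (int) littleEndian(header, 6, 2);
    }

    public static boolean usesDataDescriptor(int generalPurposeFlags) {
        return (generalPurposeFlags & DATA_DESCRIPTOR_BIT) != 0;
    }

    private static long littleEndian(byte[] bytes, int offset, int length) {
        long value = 0;
        for (int i = length - 1; i >= 0; i--) {
            value = (value << 8) | (bytes[offset + i] & 0xFF);
        }
        return value;
    }

    // Builds a one-entry archive in memory. A STORED entry must carry its size and
    // crc up front; a DEFLATED entry written without them gets a data descriptor.
    public static byte[] sampleZip(boolean stored) throws IOException {
        byte[] payload = "hello".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            ZipEntry entry = new ZipEntry("a.txt");
            if (stored) {
                CRC32 crc = new CRC32();
                crc.update(payload);
                entry.setMethod(ZipEntry.STORED);
                entry.setSize(payload.length);
                entry.setCompressedSize(payload.length);
                entry.setCrc(crc.getValue());
            }
            zip.putNextEntry(entry);
            zip.write(payload);
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        for (boolean stored : new boolean[] {false, true}) {
            int flags = readGeneralPurposeFlags(new ByteArrayInputStream(sampleZip(stored)));
            System.out.println((stored ? "stored" : "deflated")
                    + " entry uses data descriptor: " + usesDataDescriptor(flags));
        }
    }
}
```

With something like this exposed per entry, calling code could spot entries it cannot extract from a stream and fall back to a file-based reader, and a bad signature would surface as an exception instead of looking like a normal end of archive.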