On 8/5/11 9:40 PM, Stefan Bodewig wrote:
> Hi,
>
> there are eight possible permutations of compressed/uncompressed entries
> that get written to a seekable-/non-seekable stream whose size is either
> known or unknown prior to writing them.
>
> One of them is prohibited (uncompressed/non-seekable/unknown size) and
> has been prohibited before, no change here.
>
> For six of the remaining seven permutations ZipArchiveOutputStream
> should be generating archives that transparently enable ZIP64 features
> for entries if and only if they are too big to be stored without ZIP64.
> I.e. the resulting archive will either be readable by an implementation
> that doesn't support ZIP64 or it contains files that would be too big
> for such an implementation anyway. The price we pay in some cases is
> an additional 20 bytes per entry that are never used by anybody.
>
> The only case that isn't covered so far is compressed / non-seekable
> output / input of unknown size.
>
> Such entries are stored using a feature that is called the "data
> descriptor". There are two different formats of the data descriptor for
> ZIP64 and non-ZIP64 archives and the archive writer has to signal which
> type of descriptor it is going to write before it starts writing the
> entry's data.
>
> This means ZipArchiveOutputStream must decide whether it is going to use
> the ZIP64 format before it knows whether it would actually need it or
> not. If it signals it is going to use ZIP64 then an implementation that
> doesn't support ZIP64 (like Compress 1.2 or java.util.zip) may fail to
> read the archive, which is bad if the entry turns out to be smaller than
> 4GiB. If it doesn't signal ZIP64 it can't write big entries at all.
>
> This decision can be made at the granularity of a single entry. I.e. it
> is possible to not use ZIP64 for the majority of entries and enable it
> for individual entries.
>
> IMHO there is no right or wrong decision here that the library could
> make. The user code will have to decide whether ZIP64 should be enabled
> or not. The main questions to me are whether we want to attach this
> decision to the stream or the entry itself and what the default should
> be.

Can you think of practical use cases where setting it at the entry level
is needed?

>
> InfoZIP's ZIP has decided to make it an option for the whole archive
> (the command line doesn't offer much flexibility here) and make it
> default to ZIP64.
>
> My current thinking is that java.util.zip is a likely candidate for the
> receiving end of ZIPs we create, so it may be better to turn ZIP64 off
> by default, but I'm not sure.
>
> I'm leaning towards adding a setUseZip64(boolean) method at the level of
> ZipArchiveOutputStream and make it default to false. This method could
> be called in between putArchiveEntry calls to make it apply selectively
> to individual entries.

Sounds reasonable.

>
> The name is totally open for debate since as it stands it sounds as if
> you could turn off all Zip64 features which I wouldn't want to do for
> the cases that can be dealt with transparently. Then again it could use
> a Boolean argument with "null" meaning "do the best you can" and false
> "don't even use Zip64 if you think it is safe".

I don't get what you mean by "do the best you can." Does that mean
turning it on per entry whenever you somehow know it is needed, I
assume? Libraries that try to be too smart tend to be hard on both
users and maintainers, so IIUC what is going on here, I would recommend
KISS - a simple boolean property.

Phil

>
> Any ideas?
>
> Stefan
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
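
To make the simple-boolean variant concrete, here is a minimal sketch of the per-entry decision as I read the thread. It is not Commons Compress code: the class name, the Choice enum, the decide method and the UNKNOWN_SIZE marker are all hypothetical. It also assumes that a seekable stream lets the writer settle the question after the data has been written, and that a known size at or above 0xFFFFFFFF is what "too big to be stored without ZIP64" means.

```java
// Hypothetical sketch, not the Commons Compress API.
final class Zip64DecisionSketch {

    // Assumption: 0xFFFFFFFF no longer fits a plain 4-byte size field.
    static final long ZIP64_THRESHOLD = 0xFFFFFFFFL;

    // Hypothetical marker for "size not known before writing".
    static final long UNKNOWN_SIZE = -1L;

    enum Choice { ZIP64, PLAIN, DEFERRED }

    /**
     * Decides how a single entry is written.
     *
     * @param compressed whether the entry is compressed
     * @param seekable   whether the output can be revisited after writing
     * @param size       uncompressed size in bytes, or UNKNOWN_SIZE
     * @param useZip64   the user's flag; only consulted when the library
     *                   cannot work out the answer on its own
     */
    static Choice decide(boolean compressed, boolean seekable,
                         long size, boolean useZip64) {
        if (size != UNKNOWN_SIZE) {
            // Known size: ZIP64 if and only if the entry is too big.
            return size >= ZIP64_THRESHOLD ? Choice.ZIP64 : Choice.PLAIN;
        }
        if (seekable) {
            // Assumption: the writer can go back once the size is known.
            return Choice.DEFERRED;
        }
        if (!compressed) {
            // The prohibited permutation from the original mail.
            throw new IllegalArgumentException(
                "uncompressed entry of unknown size on non-seekable output");
        }
        // Compressed, non-seekable, unknown size: the data descriptor
        // format must be signalled up front, so the user has to choose.
        return useZip64 ? Choice.ZIP64 : Choice.PLAIN;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, false, UNKNOWN_SIZE, false)); // PLAIN
        System.out.println(decide(true, false, UNKNOWN_SIZE, true));  // ZIP64
        System.out.println(decide(false, true, 5L << 30, false));     // ZIP64
        System.out.println(decide(true, true, UNKNOWN_SIZE, false));  // DEFERRED
    }
}
```

In this reading the flag never disables the transparent cases; it only answers the one question the library cannot answer for itself, which is roughly what a setUseZip64(boolean) defaulting to false would amount to.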