> ZipFile relies on RandomAccessFile so any archive can't be bigger than
> the maximum size supported by RandomAccessFile. In particular the seek
> method expects a long as argument so the hard limit would be an archive
> size of 2^63-1 bytes. In practice I expect RandomAccessFile to not
> support files that big on many platforms.

Yeah ... let's cross that bridge when people complain ;)

> For the streaming mode offsets are currently stored as longs but that
> could be changed to BigIntegers easily so we could reach 2^64-1 at the
> expense of memory consumption and maybe even some performance issues
> (the offsets are not really used in calculations so I don't expect any
> major impact).

No insights on the implementation, but that might be worth changing so
it's in line with the ZipFile impl.
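Something like this minimal sketch is roughly what I'd picture - all
names are made up, this is not the actual ZipArchiveOutputStream code:

    import java.math.BigInteger;

    /** Sketch: tracks how many bytes have been written so far. */
    class OffsetTracker {
        private BigInteger written = BigInteger.ZERO;

        /** Records that len more bytes went out to the stream. */
        void count(int len) {
            written = written.add(BigInteger.valueOf(len));
        }

        /** Offset the next entry would start at; may exceed
         *  Long.MAX_VALUE. */
        BigInteger nextEntryOffset() {
            return written;
        }
    }

As the offsets only get stored and written back out, the cost should
indeed be mostly memory rather than performance.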
> Size of an individual entry (compressed or not)
> ===============================================
>
> The format supports an unsigned 64 bit integer as size, ArchiveEntry's
> get/setSize methods use long - this means there is a factor of 2.
>
> We could easily add an additional setter/getter for size that uses
> BigInteger, the infrastructure to support it would be there. OTOH it
> is questionable whether we'd support anything > Long.MAX_VALUE in
> practice because of the previous point anyway.

Especially as this is also just for one individual entry. Again - I
think I would not bother at this stage. Nothing that cannot be added
later. (A rough sketch of what such an accessor pair could look like
is below, after my sig.)

> Number of file entries in the archive
> =====================================
>
> This used to be an unsigned 16 bit integer and has grown to an
> unsigned 64 bit integer with ZIP64.
>
> ZipArchiveInputStream should work with arbitrarily many entries.
>
> ZipArchiveOutputStream uses a LinkedList to store all entries as it
> has to keep track of the metadata in order to write the central
> directory. It also uses an additional HashMap that could be removed
> easily by storing the data together with the entries themselves.
> LinkedList won't allow more than Integer.MAX_VALUE entries which
> leaves us quite a bit away from the theoretical limit of the format.

Hmmm.

> I'm confident that even I would manage to write an efficient singly
> linked list that is only ever appended to and that is iterated over
> exactly once from head to tail.

+1 for that then :) (See the second sketch after my sig for how little
code that really is.)

> I don't see myself writing an efficient map
> with a capacity of Long.MAX_VALUE or bigger, either.

There must be something like that out there already. Otherwise it could
be another nice addition to Collections ;)

> We could stick with documenting the limits of ZipFile properly. In
> practice I doubt many people will have to deal with archives of 2^63
> bytes or more. And even archives with 2^32 entries or more should be
> rare - in which case people could fall back to ZipArchiveInputStream.

Hm. Yeah ... maybe just get it out before we start implementing new
collection classes.

Cool stuff!!

cheers,
Torsten
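P.S.: To make the two sketches I promised above concrete - neither is
the actual Commons Compress code, all names are made up. First the
BigInteger-based accessor pair next to the long-based get/setSize:

    import java.math.BigInteger;

    /** Sketch: BigInteger size accessors next to the long ones. */
    class EntrySizeHolder {
        /** -1 means "unknown", mirroring java.util.zip.ZipEntry. */
        private BigInteger size = BigInteger.valueOf(-1);

        void setBigSize(BigInteger size) {
            if (size.signum() < 0) {
                throw new IllegalArgumentException("size must be >= 0");
            }
            this.size = size;
        }

        BigInteger getBigSize() {
            return size;
        }

        /** The long-based view fails once the size no longer fits. */
        long getSize() {
            if (size.bitLength() > 63) { // i.e. > Long.MAX_VALUE
                throw new ArithmeticException(
                    "size exceeds Long.MAX_VALUE");
            }
            return size.longValue();
        }
    }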
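And second the append-only singly linked list - it really is just a
handful of lines:

    import java.util.Iterator;
    import java.util.NoSuchElementException;

    /** Sketch: append-only list, iterated once from head to tail. */
    class AppendOnlyList<E> implements Iterable<E> {
        private static final class Node<E> {
            final E value;
            Node<E> next;
            Node(E value) { this.value = value; }
        }

        private Node<E> head;
        private Node<E> tail;

        /** O(1) append; no int size field, so the number of entries
         *  is not capped at Integer.MAX_VALUE. */
        void append(E value) {
            Node<E> n = new Node<E>(value);
            if (head == null) {
                head = n;
            } else {
                tail.next = n;
            }
            tail = n;
        }

        public Iterator<E> iterator() {
            return new Iterator<E>() {
                private Node<E> current = head;

                public boolean hasNext() {
                    return current != null;
                }

                public E next() {
                    if (current == null) {
                        throw new NoSuchElementException();
                    }
                    E value = current.value;
                    current = current.next;
                    return value;
                }

                public void remove() {
                    throw new UnsupportedOperationException();
                }
            };
        }
    }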