On Wed, Jan 8, 2014 at 4:41 PM, Stefan Bodewig <bode...@apache.org> wrote:
> Hi,
>
> putting the exact representation of an archive entry aside I've put down
> an idea of the API for reading and writing archives together with a POC
> port of the AR classes for this API.  All is inside
> http://svn.apache.org/repos/asf/commons/proper/compress/branches/compress-2.0/
>
> The port doesn't look pretty but I wanted to get there quickly and
> change as little as possible, partly to see how much effort porting the
> existing code base would be.  In particular I copied IOUtils into the AR
> package so I don't have to think about a proper package right now.  I
> also didn't care about Java < 7 so far.
>
> Please have a look (more on the interfaces than the actual
> implementation) and show me how wrong I am :-)
>
> Some points I'd like to highlight and discuss:
>
> * ArchiveInput and ArchiveOutput are not Streams (or Channels)
>   themselves
>
>   This is unlike Archive*Stream in 1.x
>
>   Emmanuel brought this up in a chat between the two of us and I agreed
>   with him.  You don't really use them as a stream but rather as a
>   stream per entry.
>
>   For Compressor* I'd still wrap streams/channels, different issue.
>
> * Using Channels rather than Streams
>
>   I'm a bit torn about this.  I did so because I'd prefer to base
>   ZipFile and friends on SeekableByteStream rather than RandomAccessFile
>   - so it would make the API look more symmetric.
>
>   Drawbacks I've already found
>
>   - no skip in ReadableByteChannel so you are forced to read data even
>     if something more efficient could be done.  This smells like another
>     IOUtils method.
>
>   - worse, no mark/reset or pushback, this is going to make format
>     detection uglier as we have to rewind the channel in a different way
>
>   Another concern might be that Compress 2.0 might get delayed because
>   the porting effort was bigger - I've deliberately taken the
>   Channels.new* route to wrap the existing stream based API in
>   ArArchiveInput and it seems to work (although it likely is
>   suboptimal).  Going all-in on Channels in ArArchiveOutput didn't look
>   much more difficult either, but the I/O part of output is simpler
>   anyway.
>
> * Checked vs Unchecked exceptions
>
>   I would love to make ArchiveInput be an Iterator over the entries but
>   can't do so as the things we'd need to do in next() might throw an
>   IOException.  One option may be to introduce an unchecked
>   ArchiveException and wrap all checked exceptions (and do so throughout
>   the API).

Doesn't sound very appealing.

> * RandomAccessArchiveInput as a generalization of ZipFile
>
>   This extends ArchiveInput so if you ask for an ArchiveInput to a file
>   and the format doesn't support a stream-like interface (like 7z) you
>   can still obtain one.  This is helped a lot by the fact that
>   ArchiveInput is not a stream itself.
>
> * I'm not sure about ArchiveInput#getChannel
>
>   Should next return a Pair of ArchiveEntry and Channel instead?

I don't think so, you might not want to look at an ArchiveEntry's
contents, or it might be empty.

> * tiny change to the contract of ArchiveOutput finish
>
>   finish used to throw an exception if you didn't call closeEntry for
>   the last entry while putEntry closes the previous entry.  This looked
>   inconsistent and finish now silently closes the entry as well.
>
> Stefan

Damjan
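
P.S. On the missing skip in ReadableByteChannel raised in the quoted
mail: below is a minimal sketch of what such an IOUtils-style helper
could look like, assuming blocking channels. The class name ChannelSkip,
the method signature and the 8 KiB scratch size are made up for
illustration and are not taken from the compress-2.0 branch. It
repositions the channel when it happens to be a SeekableByteChannel and
otherwise reads into a scratch buffer and throws the data away.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SeekableByteChannel;

// Hypothetical helper, not part of the compress-2.0 branch.
final class ChannelSkip {
    // Arbitrary scratch size chosen for this sketch.
    private static final int SCRATCH_SIZE = 8192;

    private ChannelSkip() { }

    /**
     * Skips up to toSkip bytes and returns the number actually skipped.
     * The result is smaller than toSkip only if the end of the channel
     * is reached first. Assumes a blocking channel: a non-blocking one
     * may return 0 from read() and make the loop spin.
     */
    static long skip(ReadableByteChannel in, long toSkip) throws IOException {
        if (toSkip <= 0) {
            return 0;
        }
        if (in instanceof SeekableByteChannel) {
            // Cheap path: move the position without reading anything.
            SeekableByteChannel seekable = (SeekableByteChannel) in;
            long pos = seekable.position();
            long available = Math.max(0, seekable.size() - pos);
            long skipped = Math.min(toSkip, available);
            seekable.position(pos + skipped);
            return skipped;
        }
        // Fallback: read into a scratch buffer and discard the contents.
        ByteBuffer scratch = ByteBuffer.allocate(SCRATCH_SIZE);
        long remaining = toSkip;
        while (remaining > 0) {
            scratch.clear();
            scratch.limit((int) Math.min(remaining, scratch.capacity()));
            int read = in.read(scratch);
            if (read < 0) {
                break; // end of channel
            }
            remaining -= read;
        }
        return toSkip - remaining;
    }
}
```

An ArchiveInput implementation could call something like
ChannelSkip.skip(channel, bytesLeftInEntry) when the caller moves on to
the next entry without reading the current one; bytesLeftInEntry is
again just a placeholder name.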