Hi,
putting the exact representation of an archive entry aside, I've put
down an idea of the API for reading and writing archives together with a
POC port of the AR classes for this API. Everything is in
http://svn.apache.org/repos/asf/commons/proper/compress/branches/compress-2.0/
The port doesn't look pretty but I wanted to get there quickly and
change as little as possible, partly to see how much effort porting the
existing code base would be. In particular I copied IOUtils into the AR
package so I don't have to think about a proper package right now. I
also didn't care about Java < 7 so far.
Please have a look (more on the interfaces than the actual
implementation) and show me how wrong I am :-)
Some points I'd like to highlight and discuss:
* ArchiveInput and ArchiveOutput are not Streams (or Channels)
themselves
This is unlike Archive*Stream in 1.x.
Emmanuel brought this up in a chat between the two of us and I agreed
with him. You don't really use them as a stream but rather as a
stream per entry.
For Compressor* I'd still wrap streams/channels, different issue.
* Using Channels rather than Streams
I'm a bit torn about this. I did so because I'd prefer to base
ZipFile and friends on SeekableByteChannel rather than RandomAccessFile
- so it would make the API look more symmetric.
Drawbacks I've already found:
- no skip in ReadableByteChannel so you are forced to read data even
if something more efficient could be done. This smells like another
IOUtils method.
- worse, no mark/reset or pushback, this is going to make format
detection uglier as we have to rewind the channel in a different way
Another concern is that Compress 2.0 might get delayed because the
porting effort is bigger - I've deliberately taken the Channels.new*
route to wrap the existing stream based API in ArArchiveInput and it
seems to work (although it is likely suboptimal). Going all-in on
Channels in ArArchiveOutput didn't look much more difficult either,
but the I/O part of output is simpler anyway.
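
To make the missing skip concrete, here is a minimal sketch of the kind
of IOUtils-style helper hinted at above. The class and method names
(ChannelSkip, skip) are hypothetical and not taken from the branch. It
assumes a blocking channel, repositions when the channel happens to be a
SeekableByteChannel, and otherwise reads into a scratch buffer and
discards the data.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SeekableByteChannel;

/** Hypothetical helper, not part of the compress-2.0 branch. */
public final class ChannelSkip {
    private ChannelSkip() {}

    /**
     * Skips up to n bytes and returns the number actually skipped.
     * Assumes a blocking channel; a non-blocking channel that keeps
     * returning 0 from read() would make the fallback loop spin.
     */
    public static long skip(ReadableByteChannel channel, long n) throws IOException {
        if (n <= 0) {
            return 0;
        }
        if (channel instanceof SeekableByteChannel) {
            // Efficient path: move the position instead of reading.
            SeekableByteChannel seekable = (SeekableByteChannel) channel;
            long pos = seekable.position();
            long remaining = Math.max(0, seekable.size() - pos);
            long toSkip = Math.min(n, remaining);
            seekable.position(pos + toSkip);
            return toSkip;
        }
        // Fallback: read into a scratch buffer and throw the data away.
        ByteBuffer scratch = ByteBuffer.allocate((int) Math.min(n, 8192));
        long skipped = 0;
        while (skipped < n) {
            scratch.clear();
            scratch.limit((int) Math.min(scratch.capacity(), n - skipped));
            int read = channel.read(scratch);
            if (read < 0) {
                break; // end of channel reached before n bytes were skipped
            }
            skipped += read;
        }
        return skipped;
    }
}
```

A channel obtained via Channels.newChannel(InputStream) takes the
fallback path, a FileChannel takes the seekable one.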
* Checked vs Unchecked exceptions
I would love to make ArchiveInput be an Iterator over the entries but
can't do so as the things we'd need to do in next() might throw an
IOException. One option may be to introduce an unchecked
ArchiveException and wrap all checked exceptions (and do so throughout
the API).
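
As an illustration of the unchecked option, here is a rough sketch of an
Iterator that wraps the IOException thrown while fetching the next
entry. All names (UncheckedIterationSketch, UncheckedArchiveException,
EntryReader, readNext) are hypothetical stand-ins and not the branch's
API. It assumes the underlying read returns null once the archive is
exhausted.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

public final class UncheckedIterationSketch {
    private UncheckedIterationSketch() {}

    /** Hypothetical unchecked wrapper around a checked IOException. */
    public static final class UncheckedArchiveException extends RuntimeException {
        public UncheckedArchiveException(IOException cause) {
            super(cause.getMessage(), cause);
        }
    }

    /** Hypothetical stand-in for the checked "read next entry" call. */
    public interface EntryReader<E> {
        /** Returns the next entry, or null at the end of the archive. */
        E readNext() throws IOException;
    }

    /** Adapts a checked reader to an Iterator that throws unchecked. */
    public static <E> Iterator<E> iterate(final EntryReader<E> reader) {
        return new Iterator<E>() {
            private E next;
            private boolean fetched;

            @Override
            public boolean hasNext() {
                if (!fetched) {
                    try {
                        next = reader.readNext();
                    } catch (IOException e) {
                        throw new UncheckedArchiveException(e);
                    }
                    fetched = true;
                }
                return next != null;
            }

            @Override
            public E next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                E result = next;
                next = null;
                fetched = false;
                return result;
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
}
```

The price is that callers no longer see the failure mode in the
signature, which is why it would have to be done consistently throughout
the API.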
* RandomAccessArchiveInput as a generalization of ZipFile
This extends ArchiveInput so if you ask for an ArchiveInput to a file
and the format doesn't support a stream-like interface (like 7z) you
can still obtain one. This is helped a lot by the fact that
ArchiveInput is not a stream itself.
* I'm not sure about ArchiveInput#getChannel
Should next return a Pair of ArchiveEntry and Channel instead?
* tiny change to the contract of ArchiveOutput finish
finish used to throw an exception if you didn't call closeEntry for
the last entry while putEntry closes the previous entry. This looked
inconsistent and finish now silently closes the entry as well.
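
The changed contract can be sketched as follows. This is only a toy
model of the state handling, with a hypothetical class name and
simplified signatures. The real ArchiveOutput in the branch may look
different.

```java
import java.io.IOException;

/** Toy model of the entry bookkeeping, not the branch's ArchiveOutput. */
public class FinishContractSketch {
    private boolean entryOpen;
    private boolean finished;
    private int closedEntries;

    /** Starts a new entry, implicitly closing the previous one. */
    public void putEntry(String name) throws IOException {
        if (finished) {
            throw new IOException("archive has already been finished");
        }
        closeEntry();
        entryOpen = true;
    }

    /** Closes the current entry, a no-op if none is open. */
    public void closeEntry() throws IOException {
        if (entryOpen) {
            entryOpen = false;
            closedEntries++;
        }
    }

    /** New behaviour: silently closes a still-open entry instead of throwing. */
    public void finish() throws IOException {
        if (finished) {
            return;
        }
        closeEntry();
        finished = true;
    }

    public int closedEntries() {
        return closedEntries;
    }

    public boolean isFinished() {
        return finished;
    }
}
```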
Stefan