On Wed, Jan 8, 2014 at 4:41 PM, Stefan Bodewig <bode...@apache.org> wrote:
> Hi,
>
> putting the exact representation of an archive entry aside I've put down
> an idea of the API for reading and writing archives together with a POC
> port of the AR classes for this API.  All is inside
> http://svn.apache.org/repos/asf/commons/proper/compress/branches/compress-2.0/
>
> The port doesn't look pretty but I wanted to get there quickly and
> change as little as possible, partly to see how much effort porting the
> existing code base would be.  In particular I copied IOUtils into the AR
> package so I don't have to think about a proper package right now.  I
> also didn't care about Java < 7 so far.
>
> Please have a look (more on the interfaces than the actual
> implementation) and show me how wrong I am :-)
>
> Some points I'd like to highlight and discuss:
>
> * ArchiveInput and ArchiveOutput are not Streams (or Channels)
>   themselves
>
>   This is unlike Archive*Stream in 1.x
>
>   Emmanuel brought this up in a chat between the two of us and I agreed
>   with him.  You don't really use them as a stream but rather as a
>   stream per entry.
>
>   For Compressor* I'd still wrap streams/channels, different issue.
>
> * Using Channels rather than Streams
>
>   I'm a bit torn about this.  I did so because I'd prefer to base
>   ZipFile and friends on SeekableByteChannel rather than RandomAccessFile
>   - so it would make the API look more symmetric.
>
>   Drawbacks I've already found
>
>   - no skip in ReadableByteChannel so you are forced to read data even
>     if something more efficient could be done.  This smells like another
>     IOUtils method.
>
>   - worse, no mark/reset or pushback, this is going to make format
>     detection uglier as we have to rewind the channel in a different way
>
>   Another concern might be that Compress 2.0 might get delayed because
>   porting effort was bigger - I've deliberately taken the Channels.new*
>   route to wrap the existing stream based API in ArArchiveInput and it
>   seems to work (although likely is suboptimal).  Going all-in on
>   Channels in ArArchiveOutput didn't look much more difficult either,
>   but the I/O part of output is simpler anyway.
>
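
To make the "no skip" drawback concrete, here is a rough sketch of what
such an IOUtils-style helper might look like.  The class and method
names (ChannelSkip, skip) are made up for illustration and are not taken
from the compress-2.0 branch.  It assumes a blocking channel, repositions
when the channel happens to be a SeekableByteChannel, and otherwise reads
into a scratch buffer and throws the data away.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SeekableByteChannel;

// Hypothetical helper, not part of the actual branch.
public final class ChannelSkip {
    private ChannelSkip() {}

    /** Skips up to n bytes and returns the number of bytes actually skipped. */
    public static long skip(ReadableByteChannel ch, long n) throws IOException {
        if (n <= 0) {
            return 0;
        }
        if (ch instanceof SeekableByteChannel) {
            // The "something more efficient" case: just move the position.
            SeekableByteChannel s = (SeekableByteChannel) ch;
            long pos = s.position();
            long available = Math.max(0, s.size() - pos);
            long toSkip = Math.min(n, available);
            s.position(pos + toSkip);
            return toSkip;
        }
        // Fallback: read and discard.  Assumes a blocking channel, so
        // read() does not keep returning 0.
        ByteBuffer scratch = ByteBuffer.allocate(4096);
        long remaining = n;
        while (remaining > 0) {
            scratch.clear();
            scratch.limit((int) Math.min(scratch.capacity(), remaining));
            int read = ch.read(scratch);
            if (read < 0) {
                break; // end of channel reached before n bytes were skipped
            }
            remaining -= read;
        }
        return n - remaining;
    }
}
```

A stream wrapped via Channels.newChannel(InputStream) takes the fallback
path, so the data is still read; only seekable channels avoid that.
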
> * Checked vs Unchecked exceptions
>
>   I would love to make ArchiveInput be an Iterator over the entries but
>   can't do so as the things we'd need to do in next() might throw an
>   IOException.  One option may be to introduce an unchecked
>   ArchiveException and wrap all checked exceptions (and do so throughout
>   the API).

Doesn't sound very appealing.
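
For reference, the wrapping option under discussion could look roughly
like the sketch below.  Everything here is hypothetical:
UncheckedArchiveException, EntrySource and iterate are illustrative
names, not the API in the branch, and the source is assumed to return
null when there are no more entries.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch of an Iterator adapter over an IOException-throwing source.
public final class EntryIteration {
    private EntryIteration() {}

    /** Unchecked wrapper, standing in for the proposed unchecked ArchiveException. */
    public static final class UncheckedArchiveException extends RuntimeException {
        public UncheckedArchiveException(IOException cause) {
            super(cause);
        }
    }

    /** Assumed shape of the underlying reader: returns null after the last entry. */
    public interface EntrySource<E> {
        E next() throws IOException;
    }

    public static <E> Iterator<E> iterate(final EntrySource<E> source) {
        return new Iterator<E>() {
            private E lookahead;
            private boolean done;

            @Override
            public boolean hasNext() {
                if (lookahead == null && !done) {
                    try {
                        lookahead = source.next();
                    } catch (IOException e) {
                        // Iterator methods cannot declare IOException, so wrap it.
                        throw new UncheckedArchiveException(e);
                    }
                    done = lookahead == null;
                }
                return lookahead != null;
            }

            @Override
            public E next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                E result = lookahead;
                lookahead = null;
                return result;
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
}
```

The cost is visible in the sketch: callers no longer see IOException in
any signature, and a read failure can surface from hasNext() as well as
from next().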

> * RandomAccessArchiveInput as a generalization of ZipFile
>
>   This extends ArchiveInput so if you ask for an ArchiveInput to a file
>   and the format doesn't support a stream-like interface (like 7z) you
>   can still obtain one.  This is helped a lot by the fact that
>   ArchiveInput is not a stream itself.
>
> * I'm not sure about ArchiveInput#getChannel
>
>   Should next return a Pair of ArchiveEntry and Channel instead?

I don't think so, you might not want to look at an ArchiveEntry's
contents, or it might be empty.

> * tiny change to the contract of ArchiveOutput finish
>
>   finish used to throw an exception if you didn't call closeEntry for
>   the last entry while putEntry closes the previous entry.  This looked
>   inconsistent and finish now silently closes the entry as well.
>
> Stefan
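
The changed finish contract can be illustrated with a toy model.  This
is only a sketch under assumptions: FinishContractSketch is a made-up
class that records calls in a log instead of writing an archive, and
entries are plain names rather than ArchiveEntry objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the putEntry/closeEntry/finish contract; no real I/O.
public final class FinishContractSketch {
    private final List<String> log = new ArrayList<>();
    private String openEntry;
    private boolean finished;

    public void putEntry(String name) {
        if (finished) {
            throw new IllegalStateException("archive already finished");
        }
        closeEntry(); // putEntry implicitly closes the previous entry
        openEntry = name;
        log.add("open " + name);
    }

    public void closeEntry() {
        if (openEntry != null) {
            log.add("close " + openEntry);
            openEntry = null;
        }
    }

    public void finish() {
        closeEntry(); // new behaviour: close the last entry instead of throwing
        finished = true;
        log.add("finish");
    }

    public List<String> log() {
        return log;
    }
}
```

With this model, putEntry("a"), putEntry("b"), finish() closes both
entries without an explicit closeEntry call.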

Damjan
