Make SplitCompressionInputStream an interface instead of an abstract class
--------------------------------------------------------------------------

                 Key: HADOOP-8003
                 URL: https://issues.apache.org/jira/browse/HADOOP-8003
             Project: Hadoop Common
          Issue Type: New Feature
          Components: io
    Affects Versions: 1.0.0, 0.23.0, 0.22.0, 0.21.0
            Reporter: Tim Broberg


To be splittable, a codec must implement SplittableCompressionCodec, which 
declares a createInputStream variant returning a SplitCompressionInputStream.

SplitCompressionInputStream is an abstract class which extends 
CompressionInputStream, the lowest level compression stream class.

Because a Java class can have only one superclass, the stream of a splittable 
codec cannot also extend DecompressorStream or BlockDecompressorStream, so it 
cannot reuse any of their code.

You either have to duplicate that code, or not be splittable.

SplitCompressionInputStream adds just a few very thin functions. Can we make 
this an interface rather than an abstract class, so that splittable 
decompression streams can extend DecompressorStream, BlockDecompressorStream, 
or whatever else we come up with in the future?
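
A minimal sketch of the shape this proposal would allow. It uses simplified 
stand-in classes rather than the real Hadoop ones: every name prefixed with 
Sketch is hypothetical, the read method is a pass-through instead of real 
decompression, and only the two adjusted-offset getters are modeled.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Stand-in for CompressionInputStream, the lowest-level stream class.
abstract class SketchCompressionInputStream extends InputStream {
    protected final InputStream in;

    SketchCompressionInputStream(InputStream in) {
        this.in = in;
    }
}

// Stand-in for DecompressorStream: the reusable code lives here.
class SketchDecompressorStream extends SketchCompressionInputStream {
    SketchDecompressorStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        return in.read(); // pass-through; a real stream would decompress
    }
}

// Hypothetical interface form of SplitCompressionInputStream.
interface SketchSplitStream {
    long getAdjustedStart();
    long getAdjustedEnd();
}

// Inherits the read logic and still satisfies the split contract.
class SketchSplittableStream extends SketchDecompressorStream
        implements SketchSplitStream {
    private final long start;
    private final long end;

    SketchSplittableStream(InputStream in, long start, long end) {
        super(in);
        this.start = start;
        this.end = end;
    }

    @Override
    public long getAdjustedStart() {
        return start;
    }

    @Override
    public long getAdjustedEnd() {
        return end;
    }
}

class SplitInterfaceDemo {
    public static void main(String[] args) throws IOException {
        SketchSplittableStream s = new SketchSplittableStream(
                new ByteArrayInputStream(new byte[] {42}), 0L, 1L);
        System.out.println(
                s.read() + " " + s.getAdjustedStart() + " " + s.getAdjustedEnd());
    }
}
```

With the abstract-class form, SketchSplittableStream would have to choose 
between the split contract and the decompressor base class. With the interface 
form it gets both.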

To my knowledge, this would impact only the BZip2 codec. None of the other 
codecs implement this form of splittability yet.

LineRecordReader looks only at whether the codec is an instance of 
SplittableCompressionCodec, and then calls the appropriate version of 
createInputStream. This would not change, so application code should not 
have to change, only the BZip2 codec and SplitCompressionInputStream.
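
The dispatch described above can be sketched roughly as follows. This is an 
illustration only, not the LineRecordReader source: all type names are 
hypothetical stand-ins, and the splittable createInputStream overload is 
reduced to a start and end offset, whereas the real method takes further 
arguments.

```java
import java.io.FilterInputStream;
import java.io.InputStream;

// Stand-in for CompressionCodec.
interface SketchCodec {
    InputStream createInputStream(InputStream in);
}

// Stand-in for SplittableCompressionCodec (signature simplified).
interface SketchSplittableCodec extends SketchCodec {
    InputStream createInputStream(InputStream in, long start, long end);
}

// Stream that remembers the split range it was opened for.
class SketchRangeStream extends FilterInputStream {
    final long start;
    final long end;

    SketchRangeStream(InputStream in, long start, long end) {
        super(in);
        this.start = start;
        this.end = end;
    }
}

// A codec without split support.
class SketchPlainCodec implements SketchCodec {
    @Override
    public InputStream createInputStream(InputStream in) {
        return in;
    }
}

// A codec with split support.
class SketchRangeCodec implements SketchSplittableCodec {
    @Override
    public InputStream createInputStream(InputStream in) {
        return in;
    }

    @Override
    public InputStream createInputStream(InputStream in, long start, long end) {
        return new SketchRangeStream(in, start, end);
    }
}

class SketchReader {
    // The caller checks only the codec type, never the stream's superclass.
    static InputStream open(SketchCodec codec, InputStream raw,
                            long start, long end) {
        if (codec instanceof SketchSplittableCodec) {
            return ((SketchSplittableCodec) codec)
                    .createInputStream(raw, start, end);
        }
        return codec.createInputStream(raw);
    }
}
```

Since the caller depends only on the codec interface, changing what the 
returned stream extends should be invisible to it.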
