I just had a look at Hadoop's TextInputFormat.
hadoop-common-2.2.0.jar contains the following compression codecs:

org.apache.hadoop.io.compress.BZip2Codec
org.apache.hadoop.io.compress.DefaultCodec
org.apache.hadoop.io.compress.DeflateCodec
org.apache.hadoop.io.compress.GzipCodec
org.apache.hadoop.io.compress.Lz4Codec
org.apache.hadoop.io.compress.SnappyCodec

(See also CompressionCodecFactory; TextInputFormat uses it to pick the right
codec from the file extension, so ".gz" files are decompressed transparently.)
So you should be good to go.
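For completeness, here's a minimal sketch of reading a gzip'ed text file
through Hadoop's TextInputFormat from the Flink Java API. The input and
output paths are just placeholders, and depending on your Flink version the
HadoopInputFormat wrapper may live in the flink-hadoop-compatibility module
rather than the package used below:

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class ReadGzippedText {

  public static void main(String[] args) throws Exception {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    // Wrap Hadoop's TextInputFormat; it selects a codec (e.g. GzipCodec)
    // based on the file extension via CompressionCodecFactory.
    Job job = Job.getInstance();
    HadoopInputFormat<LongWritable, Text> hadoopIF =
        new HadoopInputFormat<LongWritable, Text>(
            new TextInputFormat(), LongWritable.class, Text.class, job);

    // Placeholder path -- point this at your actual .gz file(s).
    TextInputFormat.addInputPath(job, new Path("hdfs:///path/to/input.gz"));

    // Each record is (byte offset, line text), already decompressed.
    DataSet<Tuple2<LongWritable, Text>> lines = env.createInput(hadoopIF);

    // Placeholder output path.
    lines.writeAsText("file:///tmp/gzip-read-output");
    env.execute("Read gzip'ed text via Hadoop TextInputFormat");
  }
}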


On Thu, Feb 19, 2015 at 9:31 PM, Robert Metzger <rmetz...@apache.org> wrote:

> Hi,
>
> right now Flink itself only has support for reading ".deflate" files. It's
> basically the same algorithm as gzip, but gzip files have an additional
> header and trailer which make the two formats incompatible.
>
> But you can easily use Hadoop InputFormats with Flink. I'm sure there is a
> Hadoop InputFormat for reading gzip'ed files.
>
>
> Best,
> Robert
>
>
> On Thu, Feb 19, 2015 at 9:25 PM, Sebastian <ssc.o...@googlemail.com>
> wrote:
>
>> Hi,
>>
>> does flink support reading gzipped files? Haven't found any info about
>> this on the website.
>>
>> Best,
>> Sebastian
>>
>
>
