Thanks very much, Tom. You saved me a lot of time by confirming that it isn't available yet. I'll go vote for HADOOP-1824.
On Tue, Mar 10, 2009 at 3:23 AM, Tom White <t...@cloudera.com> wrote:
> Hi Ken,
>
> Unfortunately, Hadoop doesn't yet support MapReduce on zipped files
> (see https://issues.apache.org/jira/browse/HADOOP-1824), so you'll
> need to write a program to unzip them and write them into HDFS first.
>
> Cheers,
> Tom
>
> On Tue, Mar 10, 2009 at 4:11 AM, jason hadoop <jason.had...@gmail.com> wrote:
> > Hadoop has support for S3; the compression support is handled at another
> > level and should also work.
> >
> > On Mon, Mar 9, 2009 at 9:05 PM, Ken Weiner <k...@gumgum.com> wrote:
> >
> >> I have a lot of large zipped (not gzipped) files sitting in an Amazon S3
> >> bucket that I want to process. What is the easiest way to process them
> >> with a Hadoop map-reduce job? Do I need to write code to transfer them
> >> out of S3, unzip them, and then move them to HDFS before running my
> >> job, or does Hadoop have support for processing zipped input files
> >> directly from S3?
> >
> > --
> > Alpha Chapters of my book on Hadoop are available
> > http://www.apress.com/book/view/9781430219422
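
For anyone else following the thread: since Tom's suggestion means writing an unzip program anyway, here is a minimal, self-contained sketch of the core unzip step using only java.util.zip. It builds a small in-memory archive to stand in for a file fetched from S3 and just prints each entry; in a real transfer job you would read the archive stream from S3 and write each entry into HDFS (e.g. via Hadoop's FileSystem.create) instead of printing. The class and entry names here are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class UnzipSketch {
    public static void main(String[] args) throws IOException {
        // Build a tiny zip in memory to stand in for an archive pulled from S3.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(buf)) {
            zos.putNextEntry(new ZipEntry("log1.txt"));
            zos.write("hello from entry one".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
            zos.putNextEntry(new ZipEntry("log2.txt"));
            zos.write("hello from entry two".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }

        // Stream the archive entry by entry. In a real job, each entry's
        // bytes would be copied to an HDFS OutputStream instead of stdout.
        try (ZipInputStream zis =
                 new ZipInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
            ZipEntry entry;
            byte[] chunk = new byte[4096];
            while ((entry = zis.getNextEntry()) != null) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int n;
                while ((n = zis.read(chunk)) != -1) {
                    out.write(chunk, 0, n);
                }
                System.out.println(entry.getName() + ": "
                        + new String(out.toByteArray(), StandardCharsets.UTF_8));
            }
        }
    }
}
```

Because ZipInputStream works on any InputStream, the same loop applies whether the source is a local file, an S3 object stream, or a byte buffer, so the unzip-and-load program stays small.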