I have a lot of large zipped (not gzipped) files sitting in an Amazon S3 bucket that I want to process. What is the easiest way to process them with a Hadoop MapReduce job? Do I need to write code to transfer them out of S3, unzip them, and then move them to HDFS before running my job, or does Hadoop have support for processing zipped input files directly from S3?
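
For context, the only approach I can think of so far is a one-off copy job along these lines. This is just a rough sketch of the "download, unzip, push to HDFS" route I described above; the bucket name, paths, and the s3n:// scheme are placeholders (the right scheme depends on the Hadoop version), not something I've actually run:

import java.net.URI;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class UnzipS3ToHdfs {
    public static void main(String[] args) throws Exception {
        // Placeholder locations -- substitute your own bucket and target directory.
        String s3Zip = "s3n://my-bucket/archives/data.zip";
        String hdfsDir = "/user/hadoop/unzipped";

        Configuration conf = new Configuration();
        FileSystem s3 = FileSystem.get(URI.create(s3Zip), conf);   // S3 filesystem
        FileSystem hdfs = FileSystem.get(conf);                    // default (HDFS) filesystem

        // Stream the zip straight out of S3 and write each entry into HDFS.
        try (ZipInputStream zin = new ZipInputStream(s3.open(new Path(s3Zip)))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.isDirectory()) continue;
                Path out = new Path(hdfsDir, entry.getName());
                try (FSDataOutputStream dst = hdfs.create(out)) {
                    // Copy the current entry; don't close the shared zip stream yet.
                    IOUtils.copyBytes(zin, dst, 4096, false);
                }
            }
        }
    }
}

If Hadoop can read zipped input directly from S3, I'd much rather skip this staging step entirely.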