Re: CombineHiveInputFormat and Merge files not working for compressed text files

2011-11-30 Thread Igor Tatarinov
I might be wrong, but I think EMR inserts a reduce job when writing data into S3. At least in my case, I am able to create a single output file with SET mapred.reduce.tasks = 1; INSERT OVERWRITE TABLE price_history_s3 ... without using any combined format. The number of mappers _is_ determined by
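
For readers following the thread, the single-reducer workaround described in this reply can be sketched in HiveQL roughly as follows. This is only an illustration: the source table name price_history_staging is hypothetical, and the original query is elided in the message. Only the SET statement and the target table price_history_s3 come from the post.

```sql
-- Ask Hadoop for a single reducer, so that a job with a reduce stage
-- writes one output file.
SET mapred.reduce.tasks = 1;

-- price_history_staging is a hypothetical source table, used for illustration only.
-- Note: a plain SELECT * is normally a map-only job, in which case the
-- reducer setting has no effect. The reply above suggests EMR adds a
-- reduce stage when the target table lives in S3.
INSERT OVERWRITE TABLE price_history_s3
SELECT * FROM price_history_staging;
```

A single reducer funnels all output through one task, so this is likely to be practical only for modest data volumes.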

CombineHiveInputFormat and Merge files not working for compressed text files

2011-11-29 Thread Mohit Gupta
Hi All, I am using Hive 0.7 on Amazon EMR. I need to merge a large number of small files into a few larger files (basically merging a number of partitions for a table into one). On doing the obvious query, i.e. (insert into a new partition select * from all partitions), a large number of small file