Re: Hive vs Pig against number of files spawned

Navis류승우 Tue, 01 Apr 2014 00:00:13 -0700

Try
set hive.hadoop.supports.splittable.combineinputformat=true;
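
A rough sketch of how this could look in a Hive session, in case it helps.
Only the first setting comes from the suggestion above. The other two are
assumptions on my part: hive.input.format is usually already set to
CombineHiveInputFormat by default, and the split-size property name and the
256 MB value are illustrative and depend on your Hadoop version (older
releases use mapred.max.split.size instead).

```
-- Let Hive combine small input files into fewer splits.
set hive.hadoop.supports.splittable.combineinputformat=true;

-- Assumption: make sure the combining input format is in use
-- (normally the default already).
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;

-- Assumption: upper bound on a combined split, in bytes (256 MB here).
-- Tune to your cluster; this value is only an example.
set mapreduce.input.fileinputformat.split.maxsize=268435456;
```

With 630 files of about 100 KB each (roughly 63 MB in total), a split limit
of this size should let the files be grouped into a handful of mappers
rather than one mapper per file, though the exact count depends on block
placement and the settings in effect.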

Thanks,
Navis


2014-04-01 15:55 GMT+09:00 Sreenath <sreenaths1...@gmail.com>:
> Hi all,
> I have a partitioned table in Hive where each partition has 630
> gzip-compressed files averaging 100 KB each. If I query these files with
> Hive, it generates exactly 630 mappers, i.e. one mapper per file. As an
> experiment, I read the same files with Pig, which combined them and
> spawned only 2 mappers, and the operation was much faster than in Hive.
> Why do Pig and Hive differ in execution style? In Hive, can we similarly
> combine small files to spawn fewer mappers?
