Try hive.hadoop.supports.splittable.combineinputformat=true;

Thanks,
Navis

2014-04-01 15:55 GMT+09:00 Sreenath <sreenaths1...@gmail.com>:
> Hi all,
>
> I have a partitioned table in Hive where each partition has 630
> gzip-compressed files, each averaging about 100 KB. If I query these files
> using Hive, it generates exactly 630 mappers, i.e. one mapper per file.
>
> As an experiment, I tried reading the same files with Pig. Pig combined
> the files and spawned only 2 mappers, and the operation was much faster
> than in Hive.
>
> Why do Pig and Hive differ in how they execute this? Can we similarly
> combine small files in Hive to spawn fewer mappers?
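
For reference, the property suggested in the reply could be applied in a Hive
session roughly as sketched below. Only the first SET line comes from this
thread. The input format line, the split size value, and the table and
partition names (my_table, dt) are assumptions added for illustration, and
the right values depend on the Hive and Hadoop versions in use.

```sql
-- From the reply: allow Hive to combine small input files into fewer splits.
SET hive.hadoop.supports.splittable.combineinputformat=true;

-- Assumption: CombineHiveInputFormat is the active input format
-- (it is commonly the default, so this line may be redundant).
SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;

-- Assumption: upper bound in bytes for one combined split (256 MB here is
-- only an illustrative value). Newer Hadoop releases use the name
-- mapreduce.input.fileinputformat.split.maxsize instead.
SET mapred.max.split.size=268435456;

-- Hypothetical query; my_table and dt are placeholder names.
SELECT COUNT(*) FROM my_table WHERE dt = '2014-04-01';
```

With 630 files of roughly 100 KB each, a partition holds about 63 MB, so if
combining takes effect one would expect far fewer than 630 mappers. Checking
the mapper count reported for the job is the simplest way to confirm.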