For some reason, I can't decrease the number of mappers with Hive 0.12 on Hadoop 2.2. I believe I was able to do that in Hive 0.10.
My table has 170K rows and 2000 small (20KB) uncompressed files (I'll try to make Hive merge these small files in the future). The relevant Hive settings are below:

    hive> SET hive.input.format;
    hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
    hive> SET mapreduce.input.fileinputformat.split.maxsize;
    mapreduce.input.fileinputformat.split.maxsize=1073741824
    hive> SET hive.hadoop.supports.splittable.combineinputformat;
    hive.hadoop.supports.splittable.combineinputformat=true
    hive> SET mapred.max.split.size;
    mapred.max.split.size=1073741824

When I run select count(1), I get 658 mappers (one for every 3 files?):

    Hadoop job information for Stage-1: number of mappers: 658; number of reducers: 1

The table is regular and uncompressed:

    # Storage Information
    SerDe Library:      org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
    InputFormat:        org.apache.hadoop.mapred.TextInputFormat
    OutputFormat:       org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
    Compressed:         No

What am I missing? Thanks!
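For reference, here is the rough arithmetic behind my expectation, as a small Python sketch. It uses only the figures quoted above, treats every file as exactly 20 KB (an approximation), and assumes the max split size is the only thing limiting how many files get combined into one split, which may well not be the whole story.

```python
# Figures taken from the question; file size is approximate.
num_files = 2000
file_size_bytes = 20 * 1024        # ~20 KB per file (assumed uniform)
max_split_bytes = 1073741824       # mapred.max.split.size (1 GiB)
observed_mappers = 658

total_bytes = num_files * file_size_bytes

# Splits needed if the max split size were the only constraint
# (ceiling division without importing math).
min_splits_by_size = -(-total_bytes // max_split_bytes)

# How many files each mapper actually received on average.
files_per_mapper = num_files / observed_mappers

print(f"total input: {total_bytes / 1024**2:.1f} MiB")
print(f"splits needed by size alone: {min_splits_by_size}")
print(f"observed files per mapper: {files_per_mapper:.2f}")
```

By size alone the whole table (about 39 MiB) would fit into a single split, so the 658 mappers, roughly 3 files each, must come from some other limit on combining.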