I have a 7 GB file, which I loaded with the command load data local inpath '/home/oracle/store_sales.csv' into table store_sales;
That file is not compressed, so I want to compress the table to make queries run faster (I don't know how to make Hive work on a compressed file directly). I therefore ran create table test as select * from store_sales, which created 113 files compressed with Snappy. Each file is about 10 MB or smaller (possibly because each Snappy file is the result of compressing one HDFS block).

Now, when I run any query against the table, it always kicks off 113 map tasks. Since the cluster has only 3 nodes, I want it to run only 3 map tasks. I set mapred.min.split.size to 350M (the total size of the compressed files is less than 1G, so 350M * 3 > 1G), but it still kicks off 113 map tasks.

Which parameter do I need to set to make it run 3 map tasks?

Listing of the table directory:

-rw-r--r-- 3 oracle supergroup 10156524 2011-08-27 17:06 /user/hive/warehouse/test/000000_0.snappy
-rw-r--r-- 3 oracle supergroup 10063292 2011-08-27 17:06 /user/hive/warehouse/test/000001_0.snappy
-rw-r--r-- 3 oracle supergroup 10057315 2011-08-27 17:06 /user/hive/warehouse/test/000002_0.snappy
-rw-r--r-- 3 oracle supergroup 10016039 2011-08-27 17:06 /user/hive/warehouse/test/000003_0.snappy
-rw-r--r-- 3 oracle supergroup 9845530 2011-08-27 17:06 /user/hive/warehouse/test/000004_0.snappy
-rw-r--r-- 3 oracle supergroup 9819626 2011-08-27 17:06 /user/hive/warehouse/test/000005_0.snappy
-rw-r--r-- 3 oracle supergroup 9801408 2011-08-27 17:07 /user/hive/warehouse/test/000006_0.snappy
-rw-r--r-- 3 oracle supergroup 9776102 2011-08-27 17:07 /user/hive/warehouse/test/000007_0.snappy
-rw-r--r-- 3 oracle supergroup 9772285 2011-08-27 17:07 /user/hive/warehouse/test/000008_0.snappy
-rw-r--r-- 3 oracle supergroup 9764841 2011-08-27 17:07 /user/hive/warehouse/test/000009_0.snappy
-rw-r--r-- 3 oracle supergroup 9738481 2011-08-27 17:07 /user/hive/warehouse/test/000010_0.snappy
-rw-r--r-- 3 oracle supergroup 9694980 2011-08-27 17:07 /user/hive/warehouse/test/000011_0.snappy
-rw-r--r-- 3 oracle supergroup 9663682 2011-08-27 17:07 /user/hive/warehouse/test/000012_0.snappy
-rw-r--r-- 3 oracle supergroup 9643515 2011-08-27 17:07 /user/hive/warehouse/test/000013_0.snappy
-rw-r--r-- 3 oracle supergroup 9634152 2011-08-27 17:07 /user/hive/warehouse/test/000014_0.snappy
-rw-r--r-- 3 oracle supergroup 9631661 2011-08-27 17:07 /user/hive/warehouse/test/000015_0.snappy
-rw-r--r-- 3 oracle supergroup 9625304 2011-08-27 17:07 /user/hive/warehouse/test/000016_0.snappy
-rw-r--r-- 3 oracle supergroup 9617673 2011-08-27 17:07 /user/hive/warehouse/test/000017_0.snappy
-rw-r--r-- 3 oracle supergroup 9612474 2011-08-27 17:08 /user/hive/warehouse/test/000018_0.snappy
-rw-r--r-- 3 oracle supergroup 9608600 2011-08-27 17:08 /user/hive/warehouse/test/000019_0.snappy
-rw-r--r-- 3 oracle supergroup 9600738 2011-08-27 17:08 /user/hive/warehouse/test/000020_0.snappy
-rw-r--r-- 3 oracle supergroup 9555315 2011-08-27 17:08 /user/hive/warehouse/test/000021_0.snappy
-rw-r--r-- 3 oracle supergroup 9550699 2011-08-27 17:08 /user/hive/warehouse/test/000022_0.snappy
-rw-r--r-- 3 oracle supergroup 9550166 2011-08-27 17:08 /user/hive/warehouse/test/000023_0.snappy
-rw-r--r-- 3 oracle supergroup 9546121 2011-08-27 17:08 /user/hive/warehouse/test/000024_0.snappy
-rw-r--r-- 3 oracle supergroup 9542885 2011-08-27 17:08 /user/hive/warehouse/test/000025_0.snappy
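
For reference, the session described above might look roughly like the sketch below. The two compression settings are an assumption about how Snappy output was enabled (the question does not state them, and they could equally come from the cluster configuration), and 367001600 is just 350 * 1024 * 1024 bytes.

```sql
-- Load the uncompressed 7 GB CSV into the source table.
load data local inpath '/home/oracle/store_sales.csv' into table store_sales;

-- ASSUMPTION: compression enabled per session with these properties;
-- the actual cluster may set them elsewhere.
set hive.exec.compress.output=true;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;

-- Copy into a new table; this produced the 113 .snappy files.
create table test as select * from store_sales;

-- Attempt to force larger splits: 350 MB expressed in bytes.
set mapred.min.split.size=367001600;

-- Any query on the table still starts 113 map tasks, for example:
select count(*) from test;
```

With 113 files totalling under 1 GB, three map tasks would each need to read about 38 files, so the setting being asked about has to combine several files into one split rather than just raise the minimum split size within a file.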