I had a typo in my last email. The settings are as follows:

hive> set mapred.min.split.size.per.node=1000000000;
hive> set mapred.min.split.size.per.rack=1000000000;
hive> set mapred.max.split.size=1000000000;
hive> set hive.merge.size.per.task=1000000000;
hive> set hive.merge.smallfiles.avgsize=1000000000;
hive> set hive.merge.size.smallfiles.avgsize=1000000000;
*hive> set hive.merge.mapfiles=true;*
hive> set hive.merge.mapredfiles=true;
*hive> set hive.mergejob.maponly=false;*

On Mon, Mar 12, 2012 at 4:27 PM, Shrijeet Paliwal <shrij...@rocketfuel.com> wrote:

> Hive version: Hive 0.8 (last commit SHA
> b581a6192b8d4c544092679d05f45b2e50d42b45)
>
> Hadoop version: cdh3u0
>
> I am trying to use the Hive merge small files feature by setting all the
> necessary params.
> I am disabling use of CombineHiveInputFormat since my input is compressed
> text.
>
> hive> set mapred.min.split.size.per.node=1000000000;
> hive> set mapred.min.split.size.per.rack=1000000000;
> hive> set mapred.max.split.size=1000000000;
> hive> set hive.merge.size.per.task=1000000000;
> hive> set hive.merge.smallfiles.avgsize=1000000000;
> hive> set hive.merge.size.smallfiles.avgsize=1000000000;
> hive> set hive.merge.mapfiles=false;
> hive> set hive.merge.mapredfiles=true;
>
> The plan decides to launch two MR jobs, but after the first job succeeds I
> get a runtime error:
>
> "java.lang.RuntimeException: Plan invalid, Reason: Reducers == 0 but reduce operator specified"
>
> I think the problem can be fixed by using this patch I came up with:
> https://gist.github.com/2025303
>
> Of course my understanding, and hence this patch, can be totally wrong.
> Please provide feedback.
>