And your queries were?
On Mon, Oct 15, 2012 at 8:09 PM, Saurabh Mishra <saurabhmishra.i...@outlook.com> wrote:
> Hi,
> I am running some Hive queries that join tables containing up to 30 million
> records each. Since the load on the reducers is very significant in these
> cases, I specifically set the following parameters before executing the
> queries:
>
> set mapred.reduce.tasks=100;
> set hive.exec.reducers.bytes.per.reducer=500000000;
> set hive.optimize.cp=true;
>
> The number of reducers the job spawns is now 160, but despite the high number
> most of the load remains on 1 or 2 reducers. Hence, in the final
> statistics, 158 reducers completed within 2-3 minutes of starting and 2
> reducers took 2 hours to run.
> Is there any way to overcome this disparity in load distribution?
> Any help in this regard will be highly appreciated.
>
> Sincerely
> Saurabh Mishra
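
Without the queries this is only a guess, but 158 fast reducers and 2 slow ones is the usual signature of a skewed join key: all rows sharing a key go to the same reducer, so raising the reducer count does not spread them out. A minimal sketch of how one might check for that and then try Hive's skew-join handling is below. The table name `big_table` and column `join_key` are hypothetical placeholders, not taken from the original post.

```sql
-- Hypothetical names: big_table, join_key. Substitute the real table and join column.
-- Step 1: look for a few keys (often NULL or a default value) that dominate the row count.
SELECT join_key, COUNT(*) AS cnt
FROM big_table
GROUP BY join_key
ORDER BY cnt DESC
LIMIT 20;

-- Step 2: if a handful of keys dominate, try Hive's runtime skew-join handling.
SET hive.optimize.skewjoin=true;
-- Keys with more rows than this threshold are treated as skewed (100000 is the default).
SET hive.skewjoin.key=100000;
```

If the dominant key turns out to be NULL or some other value that cannot match anything, filtering it out before the join may be simpler than tuning the settings.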