Top-K Optimization

2012-11-19 Thread Sivaramakrishnan Narayanan
change:
 - Determining when the Top-K optimization is applicable and setting K in ReduceSinkDesc
 - Passing the K value along to MapredWork
 - ExecDriver sets map.sort.limitrecords before executing the job corresponding to the MapredWork

This change will reduce the amount

Re: Top-K optimization

2012-11-19 Thread Namit Jain
, map-task stops after
>map.sort.limitrecords records for each reducer
> - Effectively, each mapper sends out its top-K records
>
>Hive change:
> - Determining when the Top-K optimization is applicable and setting K in
>ReduceSinkDesc
> - Passing the K value a
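
The idea quoted above (each mapper forwards only its first K records per reducer, and the reducer then picks the final K) can be illustrated with a small stand-alone sketch. This is not Hive or Hadoop code: the function names, the `partition` callback, and the single-reducer default are assumptions made for illustration, and `k` merely plays the role that map.sort.limitrecords plays in the thread.

```python
import heapq
from collections import defaultdict
from itertools import islice


def map_side_top_k(records, k, partition=lambda key: 0):
    """Simulate one map task: keep only the k smallest records per reducer.

    `records` is an iterable of (key, value) pairs. `partition` maps a key
    to a reducer id; the default sends everything to reducer 0, as a global
    ORDER BY ... LIMIT would (an assumption of this sketch).
    """
    buckets = defaultdict(list)
    for key, value in records:
        buckets[partition(key)].append((key, value))
    # nsmallest returns its result sorted ascending by the sort key.
    return {
        reducer: heapq.nsmallest(k, recs, key=lambda kv: kv[0])
        for reducer, recs in buckets.items()
    }


def reduce_top_k(sorted_mapper_outputs, k):
    """Simulate one reducer: merge the sorted mapper outputs, keep the first k."""
    merged = heapq.merge(*sorted_mapper_outputs, key=lambda kv: kv[0])
    return list(islice(merged, k))


# Two hypothetical map tasks feeding a single reducer, with K = 2.
mapper1 = [(5, "e"), (1, "a"), (9, "i"), (3, "c")]
mapper2 = [(4, "d"), (2, "b"), (8, "h")]
out1 = map_side_top_k(mapper1, 2)
out2 = map_side_top_k(mapper2, 2)
top = reduce_top_k([out1[0], out2[0]], 2)
```

Here each simulated mapper ships at most 2 records instead of its full input, and the merged result matches the global top 2, which is why truncating on the map side is safe for this kind of query.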
