Hemanth,
Thanks!!
Saurabh Agarwal

On Fri, May 14, 2010 at 9:49 AM, Hemanth Yamijala <yhema...@gmail.com> wrote:
> Saurabh,
>
> > let me reframe my question. I wanted to know how the job tracker
> > decides the assignment of input splits to task trackers based on a
> > task tracker's data locality. Where is this policy defined? Is it
> > pluggable?
>
> Sorry, I misunderstood your question then. This code is in
> o.a.h.mapred.JobInProgress. It is likely spread across many methods in
> the class, but a good starting point could be methods like
> obtainNewMapTask or obtainNewReduceTask.
>
> At the moment, this policy is not pluggable. But I know there have
> been discussions (possibly even a JIRA, though I can't locate any now)
> asking for this capability.
>
> Thanks
> Hemanth
>
> > On Fri, May 14, 2010 at 7:04 AM, Hemanth Yamijala <yhema...@gmail.com> wrote:
> >
> >> Saurabh,
> >>
> >> > I am experimenting with Hadoop and wanted to ask whether the task
> >> > distribution policy used by the job tracker is pluggable, and if
> >> > so, where in the code tree it is defined.
> >>
> >> Take a look at o.a.h.mapred.TaskScheduler. That's the abstract class
> >> that needs to be extended to define a new scheduling policy. Also,
> >> please do take a look at the existing schedulers that extend this
> >> class. There are 3-4 implementations, including the default
> >> scheduler, the capacity scheduler, the fairshare scheduler and the
> >> dynamic priority scheduler. It may be worthwhile to see if your
> >> ideas match any of the existing implementations to some degree and
> >> then consider enhancing those as a first option.
> >>
> >> Thanks
> >> Hemanth
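
For reference, a bare-bones sketch of what extending o.a.h.mapred.TaskScheduler
can look like. This is only a sketch against the 0.20-era (pre-YARN) API: the
class name MySimpleTaskScheduler is made up, the exact set of abstract methods
and the assignTasks() parameter type (TaskTrackerStatus vs. a TaskTracker
wrapper) differ between releases, so check the TaskScheduler in your own tree.
Real schedulers such as the fair and capacity schedulers live in the
org.apache.hadoop.mapred package because JobInProgress, Task, etc. are
package-private.

package org.apache.hadoop.mapred;  // same package as JobInProgress/Task,
                                   // which are package-private

import java.io.IOException;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/**
 * Skeleton of a custom scheduling policy (illustrative only).
 */
public class MySimpleTaskScheduler extends TaskScheduler {

  @Override
  public List<Task> assignTasks(TaskTrackerStatus tracker) throws IOException {
    // Called on every task tracker heartbeat; whatever tasks are returned
    // here are shipped to that tracker to run. A real policy would walk the
    // running jobs and ask each one for work via JobInProgress's
    // obtainNewMapTask()/obtainNewReduceTask(), which is where the
    // (currently non-pluggable) data-locality logic lives.
    return Collections.<Task>emptyList();
  }

  @Override
  public Collection<JobInProgress> getJobs(String queueName) {
    // Report the jobs this scheduler is tracking for a given queue; normally
    // populated via a JobInProgressListener registered in start().
    return Collections.<JobInProgress>emptyList();
  }
}

A custom scheduler is typically wired in through the
mapred.jobtracker.taskScheduler property in mapred-site.xml (set it to the
fully qualified class name and restart the JobTracker), which is also how the
fair and capacity schedulers are enabled.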