Thanks! One more question. Is the input file replicated on each node where a mapper runs? Or is just the portion processed by each mapper transferred?
Gerald

On Fri, Oct 29, 2010 at 10:11 AM, Harsh J <qwertyman...@gmail.com> wrote:
> Hello,
>
> On Fri, Oct 29, 2010 at 12:45 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>> TaskTracker will tell JobTracker how many free slots it has through
>> heartbeat. And JobTracker will choose the best TaskTracker with
>> data locality taken into consideration.
>
> Yes. To add some more: a scheduler is responsible for assigning
> tasks (based on various stats, including data locality) to suitable
> TaskTrackers. Scheduler.assignTasks(TaskTracker) is used to assign a
> TaskTracker its tasks, and the scheduler type is configurable (some
> examples are the Eager/FIFO scheduler, the Capacity scheduler, etc.).
>
> This scheduling is done when a heartbeat response is about to be sent
> back to a TaskTracker that called JobTracker.heartbeat(...).
>
>>
>> On Thu, Oct 28, 2010 at 2:52 PM, Zhenhua Guo <jen...@gmail.com> wrote:
>>> Hi, all
>>> I wonder how Hadoop schedules mappers and reducers (e.g. does it
>>> consider load balancing and affinity to data?). For example, how does
>>> it decide on which nodes mappers and reducers are to be executed, and when?
>>> Thanks!
>>>
>>> Gerald
>>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>
> --
> Harsh J
> www.harshj.com
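For anyone following along, the locality-aware assignment described above can be sketched roughly like this. This is a simplified toy illustration, not actual Hadoop scheduler code; the `ToyScheduler` and `MapTask` names are made up. On each heartbeat, the scheduler fills a tracker's free slots, preferring map tasks whose input split has a replica on that tracker's host:

```java
import java.util.*;

// Toy sketch of locality-aware task assignment (NOT real Hadoop code).
// On a heartbeat, fill the reporting tracker's free slots, preferring
// tasks whose input split is stored on that tracker's host.
class ToyScheduler {
    static class MapTask {
        final String id;
        final Set<String> splitHosts; // hosts holding replicas of the input split
        MapTask(String id, String... hosts) {
            this.id = id;
            this.splitHosts = new HashSet<>(Arrays.asList(hosts));
        }
    }

    static List<MapTask> assignTasks(String trackerHost, int freeSlots,
                                     List<MapTask> pending) {
        List<MapTask> assigned = new ArrayList<>();
        // First pass: data-local tasks only.
        for (Iterator<MapTask> it = pending.iterator();
             it.hasNext() && assigned.size() < freeSlots; ) {
            MapTask t = it.next();
            if (t.splitHosts.contains(trackerHost)) {
                assigned.add(t);
                it.remove();
            }
        }
        // Second pass: fill any remaining slots with non-local tasks.
        for (Iterator<MapTask> it = pending.iterator();
             it.hasNext() && assigned.size() < freeSlots; ) {
            assigned.add(it.next());
            it.remove();
        }
        return assigned;
    }

    public static void main(String[] args) {
        List<MapTask> pending = new ArrayList<>(Arrays.asList(
            new MapTask("t1", "nodeA", "nodeB"),
            new MapTask("t2", "nodeC"),
            new MapTask("t3", "nodeB")));
        // nodeB reports 2 free slots; both go to the nodeB-local tasks.
        for (MapTask t : assignTasks("nodeB", 2, pending)) {
            System.out.println(t.id);
        }
    }
}
```

The real schedulers are considerably more involved (they distinguish node-local, rack-local, and off-rack tasks, and weigh job priorities and fairness), but the basic shape is the same: assignment happens per heartbeat, against the free slots that heartbeat reports.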