Re: mapper and reducer scheduling

2010-11-03 Thread Zhenhua Guo
Thanks, Jeff, Harsh, He, Hemanth. That information is quite helpful! Gerald On Mon, Nov 1, 2010 at 12:01 AM, Hemanth Yamijala wrote: > Hi, > > On Mon, Nov 1, 2010 at 9:13 AM, He Chen wrote: >> If you use the default scheduler of hadoop 0.20.2 or higher. The >> jobQueueScheduler will take the d

Re: mapper and reducer scheduling

2010-10-31 Thread Hemanth Yamijala
Hi, On Mon, Nov 1, 2010 at 9:13 AM, He Chen wrote: > If you use the default scheduler of hadoop 0.20.2 or higher. The > jobQueueScheduler will take the data locality into account. This is true irrespective of the scheduler in use. Other schedulers currently add a layer to decide which job to pic

Re: mapper and reducer scheduling

2010-10-31 Thread He Chen
If you use the default scheduler of Hadoop 0.20.2 or higher, the jobQueueScheduler will take data locality into account. That means that when a heartbeat from a TT arrives, the JT will first check a cache, which is a map from each node to the data-local tasks that node has. The JT will assign node local task f
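
As a rough illustration of the heartbeat handling described above, here is a minimal Python sketch (not Hadoop code). It assumes a hypothetical `build_locality_cache` helper that maps each node to the pending tasks whose input is stored there, and a hypothetical `assign_on_heartbeat` function. What happens when no node-local task is left is cut off in the message, so the fallback to any pending task is an assumption.

```python
from collections import defaultdict


def build_locality_cache(task_locations):
    """Map each node to the tasks whose input data it holds (hypothetical cache)."""
    cache = defaultdict(list)
    for task, nodes in task_locations.items():
        for node in nodes:
            cache[node].append(task)
    return cache


def assign_on_heartbeat(node, free_slots, cache, pending):
    """Pick up to free_slots tasks for node, preferring node-local ones."""
    assigned = []
    # First pass: tasks whose input lives on the heartbeating node.
    for task in cache.get(node, []):
        if len(assigned) == free_slots:
            break
        if task in pending:
            pending.remove(task)
            assigned.append(task)
    # Assumed fallback: hand out any remaining pending task (non-local).
    while len(assigned) < free_slots and pending:
        assigned.append(pending.pop(0))
    return assigned


# Made-up example: three map tasks and the nodes holding their input blocks.
task_locations = {"m0": ["nodeA", "nodeB"], "m1": ["nodeB"], "m2": ["nodeC"]}
cache = build_locality_cache(task_locations)
pending = ["m0", "m1", "m2"]

first = assign_on_heartbeat("nodeB", 1, cache, pending)   # node-local pick
second = assign_on_heartbeat("nodeA", 2, cache, pending)  # falls back to non-local
```

In this toy run, nodeB receives a task whose data it already holds, while nodeA, whose only local task has been taken, receives the remaining tasks through the assumed fallback.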

Re: mapper and reducer scheduling

2010-10-31 Thread Harsh J
Hi, On Mon, Nov 1, 2010 at 8:19 AM, Zhenhua Guo wrote: > Thanks! > One more question. Is the input file replicated on each node where a > mapper is run? Or just the portion processed by a mapper is > transferred? With the use of HDFS, this is what happens: Mappers are run on nodes where the inpu

Re: mapper and reducer scheduling

2010-10-31 Thread Zhenhua Guo
Thanks! One more question. Is the input file replicated on each node where a mapper is run? Or is just the portion processed by a mapper transferred? Gerald On Fri, Oct 29, 2010 at 10:11 AM, Harsh J wrote: > Hello, > > On Fri, Oct 29, 2010 at 12:45 PM, Jeff Zhang wrote: >> TaskTracker will tell

Re: mapper and reducer scheduling

2010-10-29 Thread Harsh J
Hello, On Fri, Oct 29, 2010 at 12:45 PM, Jeff Zhang wrote: > TaskTracker will tell JobTracker how many free slots it has through > heartbeat. And JobTracker will choose the best tasktracker with the > consideration of data locality. Yes. To add some more, a scheduler is responsible to do assignm

Re: mapper and reducer scheduling

2010-10-29 Thread Jeff Zhang
TaskTracker will tell JobTracker how many free slots it has through heartbeat. And JobTracker will choose the best tasktracker with the consideration of data locality. On Thu, Oct 28, 2010 at 2:52 PM, Zhenhua Guo wrote: > Hi, all >  I wonder how Hadoop schedules mappers and reducers (e.g. consid
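
To make the slot reporting concrete, here is a small Python sketch of what a heartbeat might carry. It is not the Hadoop wire format: the `Heartbeat` and `TaskTracker` classes and their field names are hypothetical, and the only assumption taken from Hadoop itself is that a TaskTracker has separate map and reduce slots.

```python
from dataclasses import dataclass


@dataclass
class Heartbeat:
    """Hypothetical heartbeat payload: who is reporting and how many slots are free."""
    tracker: str
    free_map_slots: int
    free_reduce_slots: int


@dataclass
class TaskTracker:
    """Hypothetical model of a TaskTracker with fixed map and reduce slot counts."""
    name: str
    map_slots: int
    reduce_slots: int
    running_maps: int = 0
    running_reduces: int = 0

    def heartbeat(self):
        # Free slots are the configured slots minus the tasks currently running.
        return Heartbeat(
            tracker=self.name,
            free_map_slots=self.map_slots - self.running_maps,
            free_reduce_slots=self.reduce_slots - self.running_reduces,
        )


# Made-up example: a tracker with 2 map and 2 reduce slots, one map task running.
tt = TaskTracker("nodeA", map_slots=2, reduce_slots=2, running_maps=1)
hb = tt.heartbeat()
```

The JobTracker side would then use these counts to decide how many tasks it can hand back to that tracker.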

mapper and reducer scheduling

2010-10-28 Thread Zhenhua Guo
Hi all, I wonder how Hadoop schedules mappers and reducers (e.g., does it consider load balancing or affinity to data?). For example, how does it decide on which nodes mappers and reducers are executed, and when? Thanks! Gerald