The MapReduce application master reads the split info from HDFS and then submits requests to the scheduler based on the locations there.
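To make that concrete, here is a minimal sketch of the request side using the
public AMRMClient API from Hadoop 2.x. The host and rack names are hypothetical
stand-ins for what an AM would read out of the serialized split metadata, and
the RM registration plus the allocate() heartbeat loop are omitted:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.yarn.api.records.Priority;
  import org.apache.hadoop.yarn.api.records.Resource;
  import org.apache.hadoop.yarn.client.api.AMRMClient;
  import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

  public class LocalityRequestSketch {
    public static void main(String[] args) throws Exception {
      AMRMClient<ContainerRequest> amClient = AMRMClient.createAMRMClient();
      amClient.init(new Configuration());
      amClient.start();

      // Hypothetical locations; a real AM reads these from the split info
      // that the job client serialized to HDFS at submission time.
      String[] nodes = { "datanode1.example.com" };
      String[] racks = { "/default-rack" };

      // Ask for one container of 1024 MB / 1 vcore, preferring the hosts
      // that hold the split's blocks. The scheduler then tries node-local,
      // rack-local, and off-switch placement, in that order.
      amClient.addContainerRequest(new ContainerRequest(
          Resource.newInstance(1024, 1), nodes, racks,
          Priority.newInstance(0)));

      // (registerApplicationMaster() and the allocate() loop omitted.)
      amClient.stop();
    }
  }

The nodes and racks arrays are exactly where the block locations from the split
info end up, which is how locality information reaches the scheduler.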
On Thu, Apr 3, 2014 at 1:22 PM, Brad Childs <b...@redhat.com> wrote:

> Sandy/Shekhar, thank you very much for the helpful responses.
>
> One last question/clarification: getFileBlockLocations(..) in the
> FileSystem API is the only file-to-node mapping I'm aware of, and the
> only place I see it called is in the map/reduce client (FileInputFormat,
> MultiFileSplit).
>
> Is this accurate:
>
> 1) JobClient calls getFileBlockLocations(..) and serializes this info to
>    a file on the DFS.
> 2) The scheduler reads this information to determine locality.
>
> Or am I missing another call/method for determining node location for
> block locality? I ask because I haven't found the spot in the source
> where the block-location file is read for the scheduler. (A standalone
> usage sketch of getFileBlockLocations follows at the end of this thread.)
>
> -bc
>
> ----- Original Message -----
> > From: "Sandy Ryza" <sandy.r...@cloudera.com>
> > To: common-dev@hadoop.apache.org
> > Sent: Thursday, April 3, 2014 2:38:42 PM
> > Subject: Re: Yarn / mapreduce scheduling
> >
> > The equivalent code in the Fair Scheduler is in AppSchedulable.java,
> > under assignContainer(FSSchedulerNode node, boolean reserved).
> >
> > YARN uses delay scheduling for achieving data locality (a toy
> > illustration appears at the end of this thread):
> > http://people.csail.mit.edu/matei/papers/2010/eurosys_delay_scheduling.pdf
> >
> > -Sandy
> >
> > On Thu, Apr 3, 2014 at 11:16 AM, Shekhar Gupta <shkhr...@gmail.com> wrote:
> >
> > > Hi Brad,
> > >
> > > YARN scheduling does take data locality into account. In YARN, tasks
> > > are not assigned based on capacity alone. Rather, a certain number of
> > > containers is allocated on each node based on the node's capacity,
> > > and tasks run inside those containers. When placing tasks in
> > > containers, the YARN scheduler tries to satisfy data-locality
> > > requirements. I am not very familiar with the Fair Scheduler, but if
> > > you check the source of FifoScheduler you will find a function
> > > assignContainersOnNode, which looks like the following:
> > >
> > > private int assignContainersOnNode(FiCaSchedulerNode node,
> > >                                    FiCaSchedulerApp application,
> > >                                    Priority priority) {
> > >   // Data-local
> > >   int nodeLocalContainers =
> > >       assignNodeLocalContainers(node, application, priority);
> > >
> > >   // Rack-local
> > >   int rackLocalContainers =
> > >       assignRackLocalContainers(node, application, priority);
> > >
> > >   // Off-switch
> > >   int offSwitchContainers =
> > >       assignOffSwitchContainers(node, application, priority);
> > >
> > >   LOG.debug("assignContainersOnNode:" +
> > >       " node=" + node.getRMNode().getNodeAddress() +
> > >       " application=" + application.getApplicationId().getId() +
> > >       " priority=" + priority.getPriority() +
> > >       " #assigned=" +
> > >       (nodeLocalContainers + rackLocalContainers + offSwitchContainers));
> > >
> > >   return (nodeLocalContainers + rackLocalContainers +
> > >       offSwitchContainers);
> > > }
> > >
> > > In this routine you will find that data-local tasks are scheduled
> > > first, then rack-local, and then off-switch.
> > >
> > > You will find a similar function in the FairScheduler too.
> > >
> > > I hope this helps. Let me know if you have more questions or if
> > > something is wrong in my reasoning.
> > >
> > > Regards,
> > > Shekhar
> > >
> > > On Thu, Apr 3, 2014 at 10:56 AM, Brad Childs <b...@redhat.com> wrote:
> > >
> > > > Sorry if this is the wrong list; I am looking for deep
> > > > technical/Hadoop source help :)
> > > >
> > > > How does job scheduling work in the YARN framework for MapReduce
> > > > jobs? I see the YARN scheduler discussed here:
> > > >
> > > > https://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-site/YARN.html
> > > >
> > > > which leads me to believe tasks are scheduled based on node capacity
> > > > and not data locality. I've sifted through the Fair Scheduler and
> > > > can't find anything about data location or locality.
> > > >
> > > > Where does data locality play into the scheduling of map/reduce
> > > > tasks on YARN? Can someone point me to the Hadoop 2.x source where
> > > > the data block location is used to calculate node/container/task
> > > > assignment (if that's still happening)?
> > > >
> > > > -bc