Hi,

On Mon, Nov 1, 2010 at 8:19 AM, Zhenhua Guo <jen...@gmail.com> wrote:
> Thanks!
> One more question. Is the input file replicated on each node where a
> mapper is run? Or just the portion processed by a mapper is
> transferred?

No, the whole file is not copied to every node. With HDFS, this is
what happens: the scheduler tries to run each mapper on a node where
the input block it will process is already present [Data-local map
tasks]. If no TaskTracker slots are available on such a node, the
mapper is run somewhere else and only its input block ("portion
processed by a mapper") is fetched over the network, preferably from a
DataNode in the same rack [Rack-local map tasks].
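
To make the preference order above concrete, here is a rough sketch in
Python. It is not the actual JobTracker scheduling code. The names
(Node, pick_node) are made up for illustration, and the final
"off-rack" fallback is my assumption about what happens when no
same-rack node has a free slot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical stand-in for a TaskTracker/DataNode host.
    name: str
    rack: str
    free_slots: int


def pick_node(replica_holders, nodes):
    """Pick a node for one map task and report the locality achieved.

    replica_holders: nodes that store a replica of the task's input block.
    nodes: all nodes in the cluster, in the order they should be tried.
    """
    free = [n for n in nodes if n.free_slots > 0]
    replica_names = {r.name for r in replica_holders}
    replica_racks = {r.rack for r in replica_holders}

    # 1. Prefer a node that already holds the block: nothing is transferred.
    for n in free:
        if n.name in replica_names:
            return n, "data-local"

    # 2. Otherwise prefer a node in the same rack as some replica:
    #    only that one block is fetched, over the rack-internal network.
    for n in free:
        if n.rack in replica_racks:
            return n, "rack-local"

    # 3. Assumed fallback: any node with a free slot, block fetched across racks.
    if free:
        return free[0], "off-rack"

    # No free slot anywhere: the task has to wait.
    return None, "none"
```

In every case only the single block for that mapper moves, never the
whole input file.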

-- 
Harsh J
www.harshj.com
