Re: Fw: How Spark Choose Worker Nodes for respective HDFS block

2014-07-01 Thread Chris Fregly
yes, spark attempts to achieve data locality (PROCESS_LOCAL or NODE_LOCAL) where possible, just like MapReduce. it's a best practice to co-locate your Spark Workers on the same nodes as your HDFS DataNodes for just this reason (the NameNode only serves block-location metadata; the blocks themselves live on the DataNodes). this is achieved through the RDD.preferredLocations() interface method.
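as a minimal sketch of what that looks like in practice, the snippet below loads a file from HDFS and prints the preferred hosts Spark reports for each partition; these are the locality hints the scheduler uses when trying for PROCESS_LOCAL / NODE_LOCAL placement. the HDFS path and app name are placeholders, not anything from the original thread.

    import org.apache.spark.{SparkConf, SparkContext}

    object PreferredLocationsDemo {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("preferred-locations-demo")
        val sc = new SparkContext(conf)

        // textFile on an HDFS path creates a HadoopRDD whose partitions map
        // to HDFS blocks; each partition's preferred locations are the
        // DataNodes holding replicas of that block.
        val rdd = sc.textFile("hdfs:///user/demo/input.txt")

        rdd.partitions.foreach { p =>
          // preferredLocations() exposes the locality hints used by the
          // task scheduler when assigning this partition to an executor.
          println(s"partition ${p.index} -> " +
            rdd.preferredLocations(p).mkString(", "))
        }

        sc.stop()
      }
    }

if your Workers run on the same hosts as those DataNodes, the scheduler can usually place each task next to its block.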

Fw: How Spark Choose Worker Nodes for respective HDFS block

2014-06-13 Thread anishs...@yahoo.co.in
Hi All,

Is there any communication between the Spark MASTER node and the Hadoop NameNode while distributing work to WORKER nodes, as there is in MapReduce?

Please suggest. TIA

--
Anish Sneh
"Experience is the best teacher."
http://in.linkedin.com/in/anishsneh