Re: data localisation in spark

2015-06-03 Thread Sandy Ryza
PreferredLocations]] >>>> * from a list of input files or InputFormats for the application. >>>> */ >>>> @DeveloperApi >>>> def this(config: SparkConf, preferredNodeLocationData: Map[String, >>>> Set[SplitInfo]]) = { >>>> this(conf

Re: data localisation in spark

2015-06-02 Thread Shushant Arora
>>> def this(config: SparkConf, preferredNodeLocationData: Map[String, >>> Set[SplitInfo]]) = { >>> this(config) >>> this.preferredNodeLocationData = preferredNodeLocationData >>> } >>> >>> -- >>> bit1...@

Re: data localisation in spark

2015-06-02 Thread Sandy Ryza
{ >> this(config) >> this.preferredNodeLocationData = preferredNodeLocationData >> } >> >> -- >> bit1...@163.com >> >> >> *From:* Shushant Arora >> *Date:* 2015-05-31 22:54 >> *To:* user >> *Subject:* data

Re: data localisation in spark

2015-06-02 Thread Shushant Arora
> this.preferredNodeLocationData = preferredNodeLocationData > } > > -- > bit1...@163.com > > > *From:* Shushant Arora > *Date:* 2015-05-31 22:54 > *To:* user > *Subject:* data localisation in spark > > I want to understand how s

Re: data localisation in spark

2015-05-31 Thread Sandy Ryza
Hi Shushant, Spark currently makes no effort to request executors based on data locality (although it does try to schedule tasks within executors based on data locality). We're working on adding this capability at SPARK-4352. -Sandy On Sun, May
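
The distinction above (executors are requested without regard to data, but tasks are then placed on them with locality in mind) can be sketched with a toy model. This is only an illustration under stated assumptions: `Executor`, `Task`, `Locality` and `place` are hypothetical names invented here, not Spark's internal classes, and the real scheduler has more locality levels and also waits a configurable time (`spark.locality.wait`) before falling back to a less local slot.

```scala
object LocalitySketch {
  // Hypothetical, simplified types for illustration; not Spark's internal classes.
  final case class Executor(id: String, host: String)
  final case class Task(id: Int, preferredHosts: Set[String])

  sealed trait Locality
  case object NodeLocal extends Locality // executor runs on a host that holds the task's data
  case object AnyHost extends Locality   // no such executor exists; data must be read remotely

  /** Place a task on one executor from a fixed set: prefer an executor on a
    * host that holds the data, otherwise fall back to the first executor. */
  def place(task: Task, executors: Seq[Executor]): Option[(Executor, Locality)] =
    executors
      .find(e => task.preferredHosts.contains(e.host))
      .map(e => (e, NodeLocal: Locality))
      .orElse(executors.headOption.map(e => (e, AnyHost: Locality)))

  def main(args: Array[String]): Unit = {
    // The executor set is fixed up front, independent of where the data lives.
    val executors = Seq(Executor("exec-1", "node-a"), Executor("exec-2", "node-b"))
    val tasks = Seq(Task(0, Set("node-b")), Task(1, Set("node-c")))
    tasks.foreach { t =>
      println(s"task ${t.id} -> ${place(t, executors)}")
    }
  }
}
```

In this sketch task 0 lands on the executor that shares a host with its data, while task 1 has no co-located executor and falls back to any available one. That second case is the gap locality-aware executor requests are meant to close.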

data localisation in spark

2015-05-31 Thread Shushant Arora
I want to understand how Spark takes care of data localisation in cluster mode when run on YARN. 1. The driver program asks the ResourceManager for executors. Does it tell YARN's RM to check the HDFS blocks of the input data and then allocate executors accordingly? And do executors remain fixed throughout the application or d