Re: RDD: Execution and Scheduling

2015-09-22 Thread gsvic
I already have, but I needed some clarifications. Thanks for all your help!
--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/RDD-Execution-and-Scheduling-tp14177p14286.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

Re: RDD: Execution and Scheduling

2015-09-20 Thread Reynold Xin
On Sun, Sep 20, 2015 at 3:58 PM, gsvic wrote:
> Concerning answers 1 and 2:
>
> 1) How does Spark determine that a node is a "slow node", and how slow is that?

There are two cases here:

1. If a node is busy (e.g. all slots are already occupied), the scheduler cannot schedule anything on it. See "Delay

Re: RDD: Execution and Scheduling

2015-09-20 Thread gsvic
Concerning answers 1 and 2:

1) How does Spark determine that a node is a "slow node", and how slow is that?
2) How does an RDD choose a location as a preferred location, and by which criteria?

Could you please also include links to the source files for the two questions above?

Re: RDD: Execution and Scheduling

2015-09-17 Thread gsvic
Concerning answers 1 and 2:

1) How does Spark determine that a node is a "slow node", and how slow is that?
2) How does an RDD choose a location as a preferred location, and by which criteria?

Could you please also include links to the source files for the two questions above?

Re: RDD: Execution and Scheduling

2015-09-17 Thread Reynold Xin
Your understanding is mostly correct. Replies inline.

On Thu, Sep 17, 2015 at 5:23 AM, gsvic wrote:
> After reading some parts of the Spark source code, I would like to ask some
> questions about RDD execution and scheduling.
>
> First, please correct me if I am wrong about the following:
> 1) The n