Re: spark job scheduling

2016-01-27 Thread Jakob Odersky
Nitpick: the up-to-date version of said wiki page is https://spark.apache.org/docs/1.6.0/job-scheduling.html (not sure how much it changed, though).

On Wed, Jan 27, 2016 at 7:50 PM, Chayapan Khannabha wrote:
> I would start at this wiki page
> https://spark.apache.org/docs/1.2.0/job-scheduling.html

Re: spark job scheduling

2016-01-27 Thread Chayapan Khannabha
I would start at this wiki page: https://spark.apache.org/docs/1.2.0/job-scheduling.html. Although I'm sure this depends a lot on your cluster environment and the deployed Spark version, IMHO.

On Thu, Jan 28, 2016 at 10:27 AM, Niranda Perera wrote:
> Sorry, I have made typos. Let me rephrase:
>
> 1
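[For readers following the linked page: a minimal sketch of how FAIR scheduling is enabled, per the Spark job-scheduling docs. The pool name, weight, and file path below are illustrative examples, not from this thread.]

```
# spark-defaults.conf (illustrative values)
spark.scheduler.mode             FAIR
spark.scheduler.allocation.file  /path/to/fairscheduler.xml
```

```xml
<!-- fairscheduler.xml: the pool name and weights are examples only -->
<allocations>
  <pool name="production">
    <schedulingMode>FAIR</schedulingMode>
    <weight>2</weight>
    <minShare>1</minShare>
  </pool>
</allocations>
```

Jobs are then assigned to a pool per thread via the SparkContext local property `spark.scheduler.pool`; unassigned jobs go to the default pool.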

Re: spark job scheduling

2016-01-27 Thread Niranda Perera
Sorry, I have made typos. Let me rephrase:

1. As I understand, the smallest unit of work an executor can perform is a 'task'. In the 'FAIR' scheduler mode, let's say a job is submitted to the spark ctx which has a considerable amount of work to do in a single task. While such a 'big' task is running…
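[The concern above can be illustrated with a toy, single-slot simulation. This is not Spark code: real Spark schedules tasks across many executor cores, and FAIR shares are weighted rather than strict round-robin, but the ordering effect is the same. The job names and task durations are made up for illustration.]

```python
# Toy single-slot scheduler comparing FIFO and FAIR task ordering.
# Under FIFO, a later short job waits behind every task of an earlier
# big job; under FAIR (approximated here as round-robin between pools),
# tasks from both jobs interleave, so the short job finishes much sooner.
from collections import deque

def finish_times(order):
    """Run (job, duration) tasks back-to-back on one slot and return
    the time at which each job's last task completes."""
    clock, done = 0, {}
    for job, duration in order:
        clock += duration
        done[job] = clock  # a job finishes when its last task does
    return done

# Job "big" is submitted first with four 10s tasks;
# job "small" arrives right after with two 1s tasks.
big = [("big", 10)] * 4
small = [("small", 1)] * 2

# FIFO: strict submission order.
fifo_order = big + small

# FAIR (approximated): round-robin one task from each non-empty pool.
pools = [deque(big), deque(small)]
fair_order = []
while any(pools):
    for pool in pools:
        if pool:
            fair_order.append(pool.popleft())

fifo = finish_times(fifo_order)
fair = finish_times(fair_order)
print(fifo)  # {'big': 40, 'small': 42}
print(fair)  # {'big': 42, 'small': 22} -- the short job no longer waits
```

Note that a single 'big' task still cannot be preempted in either mode: scheduling decisions happen at task boundaries, which is why task granularity matters for FAIR to help.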

Re: spark job scheduling

2016-01-27 Thread Chayapan Khannabha
I think the smallest unit of work is a "Task", and an "Executor" is responsible for getting the work done? I would like to understand more about the scheduling system too. Scheduling strategies like FAIR or FIFO do have a significant impact on Spark cluster architecture design decisions.

Best,
Chayapan