I would start with the job scheduling page in the docs: https://spark.apache.org/docs/1.2.0/job-scheduling.html

Although I'm sure this depends a lot on your cluster environment and the deployed Spark version, IMHO.
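On question 2: as I understand it, with spark.scheduler.mode=FAIR the *pools* are scheduled fairly against each other, but jobs inside a single pool still run FIFO unless that pool's schedulingMode is set to FAIR in the allocation file, and the built-in 'default' pool is FIFO internally. So yes, to get fair sharing between your own jobs you would also assign them to a FAIR pool. And on question 1: as far as I know the fair scheduler only decides which job gets *free* task slots; it never preempts a task that is already running, so a long task holds its slot until it finishes.

A minimal sketch in Scala of how the pieces fit together (the pool name "fair_pool", the local master, and the file path are my assumptions, not Spark defaults):

import org.apache.spark.{SparkConf, SparkContext}

object FairSchedulingSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("fair-scheduling-sketch")
      .setMaster("local[4]")                 // assumption: local mode, 4 task slots
      .set("spark.scheduler.mode", "FAIR")   // FAIR scheduling between jobs in this app
      .set("spark.scheduler.allocation.file",
           "/path/to/fairscheduler.xml")     // assumption: pool definitions live here

    val sc = new SparkContext(conf)

    // A "big" job on one thread, assigned to a pool defined in fairscheduler.xml.
    val big = new Thread(new Runnable {
      def run(): Unit = {
        sc.setLocalProperty("spark.scheduler.pool", "fair_pool") // pool name is made up
        sc.parallelize(1 to 10000000, 4).map(_ * 2L).count()
      }
    })

    // A smaller job on another thread, same pool. With FAIR scheduling it should
    // receive free task slots instead of queueing behind the big job; note that
    // slots held by already-running tasks are not preempted.
    val small = new Thread(new Runnable {
      def run(): Unit = {
        sc.setLocalProperty("spark.scheduler.pool", "fair_pool")
        sc.parallelize(1 to 100, 2).count()
      }
    })

    big.start(); small.start()
    big.join();  small.join()
    sc.stop()
  }
}

If neither thread set spark.scheduler.pool, both jobs would land in the built-in 'default' pool and run FIFO relative to each other, even though the top-level mode is FAIR.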
On Thu, Jan 28, 2016 at 10:27 AM, Niranda Perera <niranda.per...@gmail.com> wrote:

> Sorry, I have made typos. Let me rephrase:
>
> 1. As I understand it, the smallest unit of work an executor can perform is
> a 'task'. In the 'FAIR' scheduler mode, let's say a job is submitted to the
> Spark context which has a considerable amount of work to do in a single
> task. While such a 'big' task is running, can we still submit another,
> smaller job (from a separate thread) and get it done? Or does that smaller
> job have to wait till the bigger task finishes and the resources are freed
> from the executor?
> (Essentially, what I'm asking is: in the FAIR scheduler mode, jobs are
> scheduled fairly, but at the task granularity they are still FIFO?)
>
> 2. When a job is submitted without setting a scheduler pool, the 'default'
> scheduler pool is assigned to it, which employs FIFO scheduling. But what
> happens when we have spark.scheduler.mode set to FAIR, and I submit jobs
> without specifying a scheduler pool (i.e., one with FAIR scheduling)?
> Would the jobs still run in FIFO mode within the default pool?
> Essentially, for us to really get FAIR scheduling, do we have to assign a
> FAIR scheduler pool to the job as well?
>
> On Thu, Jan 28, 2016 at 8:47 AM, Chayapan Khannabha <chaya...@gmail.com>
> wrote:
>
>> I think the smallest unit of work is a "task", and an "executor" is
>> responsible for getting the work done? I would like to understand more
>> about the scheduling system too. A scheduling strategy like FAIR or FIFO
>> does have a significant impact on Spark cluster architecture design
>> decisions.
>>
>> Best,
>>
>> Chayapan (A)
>>
>> On Thu, Jan 28, 2016 at 10:07 AM, Niranda Perera <
>> niranda.per...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I have a few questions on Spark job scheduling.
>>>
>>> 1. As I understand, the smallest unit of work an executor can perform.
>>> In the 'fair' scheduler mode, let's say a job is submitted to the Spark
>>> context which has a considerable amount of work to do in a task. While
>>> such a 'big' task is running, can we still submit another smaller job
>>> (from a separate thread) and get it done? Or does that smaller job have
>>> to wait till the bigger task finishes and the resources are freed from
>>> the executor?
>>>
>>> 2. When a job is submitted without setting a scheduler pool, the default
>>> scheduler pool is assigned to it, which employs FIFO scheduling. But what
>>> happens when we have spark.scheduler.mode as FAIR, and I submit jobs
>>> without specifying a scheduler pool (which has FAIR scheduling)? Would
>>> the jobs still run in FIFO mode with the default pool?
>>> Essentially, for us to really set FAIR scheduling, do we have to assign
>>> a FAIR scheduler pool?
>>>
>>> Best,
>>>
>>> --
>>> Niranda
>>> @n1r44 <https://twitter.com/N1R44>
>>> +94-71-554-8430
>>> https://pythagoreanscript.wordpress.com/
>>
>
> --
> Niranda
> @n1r44 <https://twitter.com/N1R44>
> +94-71-554-8430
> https://pythagoreanscript.wordpress.com/
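P.S. For completeness, the fairscheduler.xml assumed in my sketch above could look like the following (generated from Scala here only to keep the example self-contained; the pool name, weight, and minShare are illustrative values, not Spark defaults):

import java.io.PrintWriter

object WriteAllocationFile {
  def main(args: Array[String]): Unit = {
    val xml =
      """<?xml version="1.0"?>
        |<allocations>
        |  <pool name="fair_pool">
        |    <!-- FAIR here makes jobs *within* this pool share task slots fairly.
        |         Pools default to FIFO internally, including the built-in
        |         "default" pool. -->
        |    <schedulingMode>FAIR</schedulingMode>
        |    <weight>1</weight>
        |    <minShare>2</minShare>
        |  </pool>
        |</allocations>
        |""".stripMargin
    val out = new PrintWriter("/path/to/fairscheduler.xml") // assumption: same path as in the conf
    try out.write(xml) finally out.close()
  }
}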