GitHub user PraveenSeluka opened a pull request:
https://github.com/apache/spark/pull/2798
[Spark-3822] Ability to add/delete executors from Spark Context. Works in
both yarn-client and yarn-cluster mode
sc.addExecutors(count: Int)
sc.deleteExecutors(executorIds: List[String])
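A minimal driver-side sketch of how the proposed API would be used; the executor IDs passed to deleteExecutors are hypothetical values for illustration, and the calls assume a Spark build with this patch applied:

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("autoscale-demo"))

// Ask the YARN allocator for two additional executors.
sc.addExecutors(2)

// Later, release specific executors by their IDs (IDs shown are made up).
sc.deleteExecutors(List("executor-3", "executor-4"))
```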

Diagram: add/deleteExecutors in yarn-client mode
In yarn-cluster mode, the driver program and the YARN allocator threads run in
the same JVM. In yarn-client mode, however, they live in different JVMs. To
make this work in both modes, we need a way for the driver and the YARN
allocator to communicate.
At a high level, this works as follows:
- AutoscaleClient is created by SparkContext, and AutoscaleServer is created
in the YARN ApplicationMaster code path.
- Once the AutoscaleServerActor comes up, it registers itself with the
AutoscaleClientActor.
- SparkContext forwards the add/deleteExecutors call through the
AutoscaleClient to the AutoscaleServerActor.
- AutoscaleServer holds a reference to the YarnAllocator. On receiving
addExecutors, it simply increments the maxExecutors count in YarnAllocator,
and the reporter thread then allocates the corresponding containers.
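The flow above could be sketched roughly as the following Akka-style actors. This is only an illustration of the message path described in this PR, not the actual patch: the message case classes, the `maxExecutors` field, and the `killExecutor` helper on YarnAllocator are all assumptions made for the sketch.

```scala
import akka.actor.{Actor, ActorRef}

// Hypothetical messages exchanged between client and server actors.
case class AddExecutors(count: Int)
case class DeleteExecutors(executorIds: List[String])
case object RegisterAutoscaleServer

// Runs inside the YARN ApplicationMaster, next to the YarnAllocator.
class AutoscaleServerActor(allocator: YarnAllocator, client: ActorRef)
  extends Actor {

  // Register with the driver-side client as soon as the server comes up.
  override def preStart(): Unit = client ! RegisterAutoscaleServer

  def receive = {
    case AddExecutors(count) =>
      // Raise the target; the reporter thread picks this up and
      // requests the extra containers from YARN.
      allocator.maxExecutors += count
    case DeleteExecutors(ids) =>
      ids.foreach(allocator.killExecutor) // hypothetical helper
  }
}

// Runs in the driver JVM; SparkContext sends it the add/delete requests.
class AutoscaleClientActor extends Actor {
  private var server: Option[ActorRef] = None

  def receive = {
    case RegisterAutoscaleServer =>
      server = Some(sender())
    case msg @ (_: AddExecutors | _: DeleteExecutors) =>
      // Forward driver requests on to the server in the AM.
      server.foreach(_ forward msg)
  }
}
```

Because the client forwards rather than acting locally, the same path works in both yarn-client mode (two JVMs) and yarn-cluster mode (one JVM).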
Difference compared to the proposed design:
This creates a separate AutoscaleClient, whereas the currently proposed design
(https://issues.apache.org/jira/secure/attachment/12673181/dynamic-scaling-executors-10-6-14.pdf)
adds these new messages to CoarseGrainedSchedulerBackend itself. One minor
advantage of doing it this way is that the dynamic scaling logic lives as a
separate module in Spark, and the rest of Spark Core is largely unaffected.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/PraveenSeluka/spark add_delete_executors_hook
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/2798.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2798
----
commit af7a98c17d5c410d8e1c5556465257448e4d630d
Author: Praveen Seluka <[email protected]>
Date: 2014-10-13T09:30:24Z
add/delete executors from Spark Context. Works on Yarn - both in client and
cluster mode
commit c4a883f76c5cc3187adac4cd2e921fa3fe37ded8
Author: Praveen Seluka <[email protected]>
Date: 2014-10-13T14:37:42Z
scalastyle and logging fixes
----