GitHub user PraveenSeluka opened a pull request:
https://github.com/apache/spark/pull/2798
[Spark-3822] Ability to add/delete executors from Spark Context. Works in
both yarn-client and yarn-cluster mode
sc.addExecutors(count: Int)
sc.deleteExecutors(executorIds: List[String])
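A minimal driver-side sketch of how the proposed API would be used; the executor IDs passed to deleteExecutors are hypothetical values for illustration, and the calls assume a Spark build with this patch applied:

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("autoscale-demo"))

// Ask the YARN allocator for two additional executors.
sc.addExecutors(2)

// Later, release specific executors by their IDs (IDs shown are made up).
sc.deleteExecutors(List("executor-3", "executor-4"))
```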

Diagram: add/deleteExecutors in yarn-client mode
In yarn-cluster mode, the driver program and the YARN allocator threads run in
the same JVM. In yarn-client mode, however, they live in different JVMs. To
make this work in both modes, we need a way for the driver and the YARN
allocator to communicate.
At a high level, this works as follows:
- AutoscaleClient is created by SparkContext, and AutoscaleServer is created
in the YARN ApplicationMaster code path.
- Once the AutoscaleServerActor comes up, it registers itself with the
AutoscaleClientActor.
- SparkContext forwards the add/deleteExecutors call through the
AutoscaleClient to the AutoscaleServerActor.
- AutoscaleServer holds a reference to the YarnAllocator. On receiving
addExecutors, it simply increments the maxExecutors count in YarnAllocator,
and the reporter thread then allocates the corresponding containers.
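The flow above could be sketched roughly as the following Akka-style actors. This is only an illustration of the message path described in this PR, not the actual patch: the message case classes, the `maxExecutors` field, and the `killExecutor` helper on YarnAllocator are all assumptions made for the sketch.

```scala
import akka.actor.{Actor, ActorRef}

// Hypothetical messages exchanged between client and server actors.
case class AddExecutors(count: Int)
case class DeleteExecutors(executorIds: List[String])
case object RegisterAutoscaleServer

// Runs inside the YARN ApplicationMaster, next to the YarnAllocator.
class AutoscaleServerActor(allocator: YarnAllocator, client: ActorRef)
  extends Actor {

  // Register with the driver-side client as soon as the server comes up.
  override def preStart(): Unit = client ! RegisterAutoscaleServer

  def receive = {
    case AddExecutors(count) =>
      // Raise the target; the reporter thread picks this up and
      // requests the extra containers from YARN.
      allocator.maxExecutors += count
    case DeleteExecutors(ids) =>
      ids.foreach(allocator.killExecutor) // hypothetical helper
  }
}

// Runs in the driver JVM; SparkContext sends it the add/delete requests.
class AutoscaleClientActor extends Actor {
  private var server: Option[ActorRef] = None

  def receive = {
    case RegisterAutoscaleServer =>
      server = Some(sender())
    case msg @ (_: AddExecutors | _: DeleteExecutors) =>
      // Forward driver requests on to the server in the AM.
      server.foreach(_ forward msg)
  }
}
```

Because the client forwards rather than acting locally, the same path works in both yarn-client mode (two JVMs) and yarn-cluster mode (one JVM).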
Difference compared to the proposed design:
This creates a separate AutoscaleClient, whereas the currently proposed design
(https://issues.apache.org/jira/secure/attachment/12673181/dynamic-scaling-executors-10-6-14.pdf)
adds these new messages to CoarseGrainedSchedulerBackend itself. One minor
advantage of doing it this way is that the dynamic scaling logic lives as a
separate module in Spark, and the rest of Spark Core is largely unaffected.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/PraveenSeluka/spark add_delete_executors_hook
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/2798.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2798
----
commit af7a98c17d5c410d8e1c5556465257448e4d630d
Author: Praveen Seluka <[email protected]>
Date: 2014-10-13T09:30:24Z
add/delete executors from Spark Context. Works on Yarn - both in client and
cluster mode
commit c4a883f76c5cc3187adac4cd2e921fa3fe37ded8
Author: Praveen Seluka <[email protected]>
Date: 2014-10-13T14:37:42Z
scalastyle and logging fixes
----