GitHub user CodingCat opened a pull request: https://github.com/apache/spark/pull/63
Simplify the implementation of CoarseGrainedSchedulerBackend

There are 5 main data structures in the class. After reading the source code, I found that some of them are not actually used and some are not necessary.

This PR also fixes a bug. When an executor is removed, the original implementation updates totalCores using freeCores(executorId); it should instead update totalCores using the total number of cores of that executor.

* executorAddress - not actually used; it is only added when an executor registers and removed when the executor is removed
* executorHost - added when an executor registers and removed when the executor is removed; it is also used for building WorkerOffer
* WorkerOffer class - the original implementation builds a new WorkerOffer object at every ReviveOffer, only because freeCores has changed
* freeCores - used for building WorkerOffer; we construct a WorkerOffer from the values in freeCores on every execution of ReviveOffer

My proposal: change cores in WorkerOffer from val to var and keep track of its changes. In this way, we can remove most of the data structures and replace them with a HashMap:

```
workerOffers = new HashMap[String, WorkerOffer]
```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CodingCat/spark simplify_CoarseGrainedSchedulerBackend

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/63.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #63

----
commit 0c0e409d49afa954703462b338af04481b74f563
Author: CodingCat <zhunans...@gmail.com>
Date:   2014-03-03T04:22:09Z

    simplify the implementation of CoarseGrainedSchedulerBackend

----

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well.
If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
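
For readers of this thread, here is a minimal sketch of the idea in the pull request description: one HashMap of mutable offers in place of several parallel maps, and a core count that is reduced by the executor's full core count on removal. It is not the code in the PR. `OfferTracker`, its method names, and the extra `totalCores` field on the stand-in `WorkerOffer` are assumptions made for illustration only.

```scala
import scala.collection.mutable.HashMap

// Simplified stand-in for WorkerOffer: `cores` is a var, as the PR proposes.
// `totalCores` is a hypothetical extra field, added only to illustrate the bug fix.
class WorkerOffer(
    val executorId: String,
    val host: String,
    val totalCores: Int,
    var cores: Int)

// Hypothetical helper, not the actual CoarseGrainedSchedulerBackend.
class OfferTracker {
  // One map keyed by executor ID holds both the host and the free-core count.
  private val workerOffers = new HashMap[String, WorkerOffer]
  private var totalCoreCount = 0

  def registerExecutor(executorId: String, host: String, cores: Int): Unit = {
    workerOffers(executorId) = new WorkerOffer(executorId, host, cores, cores)
    totalCoreCount += cores
  }

  // Update the existing offer in place instead of building a new one.
  def launchTask(executorId: String, cpus: Int = 1): Unit =
    workerOffers.get(executorId).foreach { offer => offer.cores -= cpus }

  def taskFinished(executorId: String, cpus: Int = 1): Unit =
    workerOffers.get(executorId).foreach { offer => offer.cores += cpus }

  // Subtract the executor's total cores, not just its currently free cores.
  def removeExecutor(executorId: String): Unit =
    workerOffers.remove(executorId).foreach { offer =>
      totalCoreCount -= offer.totalCores
    }

  def offers: Seq[WorkerOffer] = workerOffers.values.toSeq
  def totalCores: Int = totalCoreCount
  def freeCores(executorId: String): Option[Int] =
    workerOffers.get(executorId).map(_.cores)
}
```

In this sketch, removing an executor while one of its tasks is still running leaves the total at zero. Subtracting only the free cores would leave one core counted for an executor that no longer exists.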