GitHub user CodingCat opened a pull request:

    https://github.com/apache/spark/pull/63

    simplify the implementation of CoarseGrainedSchedulerBackend

    The class has 5 main data structures. After reading the source code, I
    found that some of them are never actually used and others are not
    necessary.
    
    This PR also fixes a bug. When an executor is removed, the original
    implementation adjusts totalCores by freeCores(executorId); it should
    instead adjust totalCores by the total number of cores of that executor.
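
    To illustrate the bug, here is a minimal sketch, not the actual Spark
    code. The object and method names are hypothetical, and it assumes that
    removing an executor subtracts its cores from a shared counter.

    ```scala
    import java.util.concurrent.atomic.AtomicInteger
    import scala.collection.mutable.HashMap

    // Hypothetical stand-in for the backend's core accounting (not the real class).
    object CoreAccounting {
      // Removal as described for the original code: subtracts only the cores
      // that happen to be free when the executor goes away.
      def removeUsingFreeCores(total: AtomicInteger,
                               freeCores: HashMap[String, Int],
                               executorId: String): Int =
        total.addAndGet(-freeCores(executorId))

      // Removal as proposed: subtracts every core the executor registered with.
      def removeUsingTotalCores(total: AtomicInteger,
                                registeredCores: HashMap[String, Int],
                                executorId: String): Int =
        total.addAndGet(-registeredCores(executorId))
    }
    ```

    With a single 8-core executor that has 3 cores busy (so 5 free), the
    first method leaves the counter at 3 after removal, while the second
    brings it back to 0.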
    
    * executorAddress

      It is never actually read; entries are only added when an executor
      registers and removed when the executor is removed.

    * executorHost

      Entries are added when an executor registers and removed when the
      executor is removed. It is also used for building WorkerOffer.

    * WorkerOffer class

      The original implementation builds a new WorkerOffer object on every
      ReviveOffers message, just because freeCores has changed.

    * freeCores

      It is used for building WorkerOffer: we construct a WorkerOffer from
      the values in freeCores on every execution of ReviveOffers.
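
    The pattern described in this list can be sketched roughly as follows.
    This is a simplified illustration, not the real class:
    OriginalWorkerOffer and ParallelMapsBookkeeping are hypothetical names,
    and the method names are assumptions.

    ```scala
    import scala.collection.mutable.HashMap

    // Simplified stand-in for WorkerOffer with an immutable core count.
    class OriginalWorkerOffer(val executorId: String, val host: String, val cores: Int)

    // Hypothetical sketch: two maps keyed by executor id, kept in sync by hand.
    class ParallelMapsBookkeeping {
      val executorHost = new HashMap[String, String]
      val freeCores = new HashMap[String, Int]

      def register(executorId: String, host: String, cores: Int): Unit = {
        executorHost(executorId) = host
        freeCores(executorId) = cores
      }

      def remove(executorId: String): Unit = {
        executorHost -= executorId
        freeCores -= executorId
      }

      // Every revive allocates a fresh offer per executor, because the
      // core count inside an existing offer cannot be updated.
      def makeOffers(): Seq[OriginalWorkerOffer] =
        executorHost.toSeq.map { case (id, host) =>
          new OriginalWorkerOffer(id, host, freeCores(id))
        }
    }
    ```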
    
    My proposal:
    
    We can change cores in WorkerOffer from a val to a var and keep track of
    its changes there. This way, we can remove most of the data structures
    and replace them with a HashMap:
    
    ```
    val workerOffers = new HashMap[String, WorkerOffer]
    ```
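
    A minimal sketch of how this could look, assuming the map is keyed by
    executor id. MutableWorkerOffer, SingleMapBookkeeping and the method
    names are hypothetical and only illustrate the idea; they are not the
    code in this PR.

    ```scala
    import scala.collection.mutable.HashMap

    // Simplified stand-in for WorkerOffer with cores changed from val to var.
    class MutableWorkerOffer(val executorId: String, val host: String, var cores: Int)

    // Hypothetical sketch: one map holds the host and the free cores of each executor.
    class SingleMapBookkeeping {
      val workerOffers = new HashMap[String, MutableWorkerOffer]

      def register(executorId: String, host: String, cores: Int): Unit =
        workerOffers(executorId) = new MutableWorkerOffer(executorId, host, cores)

      // Update the free cores in place instead of rebuilding the offer.
      def launchTask(executorId: String, cpus: Int): Unit = {
        val offer = workerOffers(executorId)
        offer.cores = offer.cores - cpus
      }

      def finishTask(executorId: String, cpus: Int): Unit = {
        val offer = workerOffers(executorId)
        offer.cores = offer.cores + cpus
      }

      def remove(executorId: String): Unit = workerOffers -= executorId

      // Reviving offers hands out the existing objects; nothing is allocated.
      def makeOffers(): Seq[MutableWorkerOffer] = workerOffers.values.toSeq
    }
    ```

    The trade-off is that the offers become shared mutable state, so any
    code holding one sees later changes to cores.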

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CodingCat/spark simplify_CoarseGrainedSchedulerBackend

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/63.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #63
    
----
commit 0c0e409d49afa954703462b338af04481b74f563
Author: CodingCat <zhunans...@gmail.com>
Date:   2014-03-03T04:22:09Z

    simplify the implementation of CoarseGrainedSchedulerBackend

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---