Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/2746#issuecomment-60470225
Hi all, I have implemented the relevant changes in #2840. The interfaces
there are very similar to what we have discussed. A major difference is that
the scheduler backend specifies the total number of executors desired for the
application rather than the pending number of executors. This resolves the
race condition that @sryza pointed out above. Before these changes went in, I
observed that race condition fairly frequently in practice.
(Note that we can't rely on `YarnAllocator` to handle the over-allocation
scenarios as @vanzin suggested. This is because by the time the AM receives a
pending request from the driver, it has no way of knowing whether the request
was sent before or after any prior requests were fulfilled.)
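To illustrate the distinction, here is a minimal sketch (the class and method names are hypothetical, not Spark's actual API) of why a request carrying the *total* desired count is safe against duplicated or reordered messages, while a request carrying a *delta* of pending executors is not:

```java
import java.util.List;

// Hypothetical sketch: compares delta-based vs total-based request
// semantics between a driver and an allocator (e.g. the YARN AM).
public class TargetSemantics {

    // Delta semantics: the allocator adds every pending request to its
    // count. If the driver re-sends a request before it learns that the
    // first one was fulfilled, the allocator over-allocates.
    static int applyDeltas(int current, List<Integer> pendingDeltas) {
        int total = current;
        for (int d : pendingDeltas) {
            total += d;
        }
        return total;
    }

    // Total semantics: each message carries the full desired count, so a
    // duplicate or reordered request cannot over-allocate; the allocator
    // simply converges to the most recent target.
    static int applyTotals(int current, List<Integer> totals) {
        return totals.isEmpty() ? current : totals.get(totals.size() - 1);
    }

    public static void main(String[] args) {
        // The driver wants 5 executors and retries the request once
        // before hearing back.
        System.out.println(applyDeltas(0, List.of(5, 5))); // 10: over-allocated
        System.out.println(applyTotals(0, List.of(5, 5))); // 5: correct
    }
}
```

Because the total-count message is idempotent, the AM no longer needs to reason about whether a request predates the fulfillment of earlier ones.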
I intend to merge #2840 before this patch, so I would really appreciate it
if all of you could review that one first. I have tested it quite rigorously
and no longer observe any of the race conditions we used to see with the old
semantics.
@sryza @vanzin @pwendell @kayousterhout