Github user sryza commented on the pull request:

    https://github.com/apache/spark/pull/2746#issuecomment-59863719
  
    > By "timer" I was referring to the logical timer, not java.util.Timer as 
an implementation detail.
    
    @andrewor14, my bad, I should have looked more closely at the patch.
    
    @kayousterhout, @andrewor14 is exactly right.  MR/Tez containers are 
shorter-lived, which gives elasticity at the expense of suffering JVM 
startup times more often.  My point wasn't that "because they were able to pull 
it off, Spark should be able to pull it off" so much as "these frameworks have 
an ease-of-use advantage over Spark if we don't address this problem, and 
adding some complexity in Spark is worth closing that gap".  I think ease of 
use is more important than ease of understanding.  The main complaint I hear 
about Spark is that it's difficult out of the box.
    
    With regard to ease of understanding, my opinion is that Kay's policy 
actually seems a little more straightforward.  I imagine that when a user needs 
to care about how this stuff works, the question they're probably asking is: "as 
I watch this stage run and (go slow|take too many resources), is the number of 
executors I have reasonable?"  An answer based on the number of pending tasks 
is easier to grasp than an answer that changes over time and is based on 
exponentiation.  That is, yes, the time/exponent rule itself is simple, but 
interpreting its behavior is less so.
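
    To make the contrast concrete, here is a minimal, hypothetical sketch of 
the two policies being discussed. The function names and numbers are 
illustrative only, not Spark's actual implementation:

```python
def exponential_policy(current_executors, rounds):
    """Time-based policy (sketch): double the number of executors
    requested each interval that the backlog persists, so the target
    depends on how long you have been waiting."""
    targets = []
    n = current_executors
    for _ in range(rounds):
        n *= 2
        targets.append(n)
    return targets


def pending_tasks_policy(pending_tasks, tasks_per_executor):
    """Pending-task-based policy (sketch): request just enough
    executors to cover the current backlog; the target follows
    directly from the stage's task count."""
    # ceiling division: executors needed to run all pending tasks
    return -(-pending_tasks // tasks_per_executor)


# With the exponential rule, the answer to "is my executor count
# reasonable?" changes every interval:
print(exponential_policy(1, 4))      # [2, 4, 8, 16]

# With the pending-task rule, it is a single number a user can
# check against the stage's pending work:
print(pending_tasks_policy(100, 4))  # 25
```

    The point of the sketch: a user inspecting a slow stage can verify 
`pending_tasks_policy` with one division, whereas interpreting the 
exponential target also requires knowing how many intervals have elapsed.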

