Hello,

We're running some experiments with Spark (v1.4) and have some questions about its scheduling behavior that I'm hoping someone can answer.

1. What is a task set? The term appears in the Spark logs from our runs, but we can't find a definition in the online documentation, or an explanation of how it relates to the Spark concepts of jobs, stages, and tasks. This makes it hard to reason about the scheduling behavior.

2. What heuristic is used to kill executors when running Spark on YARN with dynamic allocation enabled? From the logs we observe that executors with work (task sets) queued to them are being killed, and that work is being reassigned to other executors. This seems inconsistent with the online documentation, which says an executor isn't killed until it has been idle for a user-configurable number of seconds.

3. We're using fair scheduler pooling with multiple pools, each with a different weight. Is it correct that there are queues both in the pools and in the executors?

We can provide more details on our setup if desired.

Regards,
Rob Saccone
IBM T. J. Watson Center
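
P.S. In case it helps, here is a rough sketch of the kind of configuration we're running with. The app name, pool name, file path, and timeout value below are placeholders rather than our exact settings:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("scheduling-experiment")
  // Dynamic allocation on YARN; our understanding is that executors should only
  // be released after sitting idle for the configured timeout.
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true")
  .set("spark.dynamicAllocation.executorIdleTimeout", "60s")
  // Fair scheduling; the pools and their weights are defined in the allocation file.
  .set("spark.scheduler.mode", "FAIR")
  .set("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml")

val sc = new SparkContext(conf)
// Jobs submitted from this thread go into the (placeholder) "experiments" pool.
sc.setLocalProperty("spark.scheduler.pool", "experiments")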
We're running some experiments with Spark (v1.4) and have some questions about its scheduling behavior. I am hoping someone can answer the following questions. What is a task set? It is mentioned in the Spark logs we get from our runs but we can't seem to find a definition and how it relates to the Spark concepts of Jobs, Stages, and Tasks in the online documentation. This makes it hard to reason about the scheduling behavior. What is the heuristic used to kill executors when running Spark with YARN in dynamic mode? From the logs what we observe is that executors that have work (task sets) queued to them are being killed and the work (task sets) are being reassigned to other executors. This seems inconsistent with the online documentation which says that executors aren't killed until they've been idle for a user configurable number of seconds. We're using the Fair scheduler pooling with multiple pools each with different weights, so is it correct that there are queues in the pools and in the executors as well? We can provide more details on our setup if desired. Regards, Rob Saccone IBM T. J. Watson Center