Hi all, I have a few questions on how Spark is integrated with Mesos - any details, or pointers to a design document / relevant source, will be much appreciated.
I'm aware of this description: https://github.com/apache/spark/blob/master/docs/running-on-mesos.md
But it's pretty high-level as far as the design is concerned, while I'm looking for lower-level details on how Spark actually calls the Mesos APIs, how it launches tasks, etc. Namely:

1. Does Spark create a Mesos Framework instance for each Spark application (SparkContext)?

2. Citing from the link above: "In "fine-grained" mode (default), each Spark task runs as a separate Mesos task ... comes with an additional overhead in launching each task."
Does this mean the Mesos slave launches a Spark Executor for each task? (unlikely...) Or does the slave host keep a number of Spark Executors pre-launched (one per application) and send each task to its application's executor? And what is the resource offer in that case? Is it a slice of the host's CPU, offered to any Framework (Spark app/context), which then sends a task to run on it? Or is it a slice of an application's Executor that went idle and is offered back to its own Framework?

3. "The "coarse-grained" mode will instead launch only one long-running Spark task on each Mesos machine, and dynamically schedule its own "mini-tasks" within it."
What is this special task? Is it the Spark app Executor? How are these mini-tasks different from 'regular' Spark tasks? And how are resources allocated/offered in this mode? (I've put a small sketch of how I understand the mode switch in the P.S. below.)

Regards,
Gidon
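P.S. For concreteness, here is a minimal sketch of how I currently understand the two modes are selected from the application side. I'm assuming that spark.mesos.coarse is the relevant switch and that a mesos://host:port master URL is what ties a SparkContext to a Mesos master; the host name below is made up, and this is only my reading of the docs, not how Spark necessarily does it internally. Please correct me if I've got the knobs wrong.

  import org.apache.spark.{SparkConf, SparkContext}

  object MesosModesSketch {
    def main(args: Array[String]): Unit = {
      val conf = new SparkConf()
        .setAppName("mesos-modes-question")
        // Hypothetical Mesos master host; the mesos://host:port URL is what I
        // assume registers this SparkContext with the Mesos master.
        .setMaster("mesos://mesos-master.example.com:5050")
        // My assumption: the default "false" gives fine-grained mode (each Spark
        // task runs as a separate Mesos task), while "true" switches to
        // coarse-grained mode (one long-running Mesos task per machine, inside
        // which Spark schedules its own work).
        .set("spark.mesos.coarse", "true")

      val sc = new SparkContext(conf)
      // Trivial job, just so something actually gets scheduled as tasks.
      println(sc.parallelize(1 to 100).map(_ * 2).sum())
      sc.stop()
    }
  }

If this is roughly right, my questions above are about what happens underneath this: what Framework gets registered, what the offers look like, and what exactly runs on the slaves in each of the two modes.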