Ah, that's why all the stuff about scheduler pools is under the section "Scheduling Within an Application <https://spark.apache.org/docs/latest/job-scheduling.html#scheduling-within-an-application>". 😅 I am so used to talking to my coworkers about jobs in the sense of applications that I forgot your typical Spark application submits multiple "jobs", each of which has multiple stages, etc.
So in my case I need to read up more closely about YARN queues <https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html> since I want to share resources *across* applications.

Thanks Mark!

On Wed, Apr 5, 2017 at 4:31 PM Mark Hamstra <m...@clearstorydata.com> wrote:

> `spark-submit` creates a new Application that will need to get resources
> from YARN. Spark's scheduler pools will determine how those resources are
> allocated among whatever Jobs run within the new Application.
>
> Spark's scheduler pools are only relevant when you are submitting multiple
> Jobs within a single Application (i.e., you are using the same SparkContext
> to launch multiple Jobs) and you have used SparkContext#setLocalProperty to
> set "spark.scheduler.pool" to something other than the default pool before
> a particular Job intended to use that pool is started via that SparkContext.
>
> On Wed, Apr 5, 2017 at 1:11 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
>
> Hmm, so when I submit an application with `spark-submit`, I need to
> guarantee it resources using YARN queues and not Spark's scheduler pools.
> Is that correct?
>
> When are Spark's scheduler pools relevant/useful in this context?
>
> On Wed, Apr 5, 2017 at 3:54 PM Mark Hamstra <m...@clearstorydata.com> wrote:
>
> grrr... s/your/you're/
>
> On Wed, Apr 5, 2017 at 12:54 PM, Mark Hamstra <m...@clearstorydata.com> wrote:
>
> Your mixing up different levels of scheduling. Spark's fair scheduler
> pools are about scheduling Jobs, not Applications; whereas YARN queues with
> Spark are about scheduling Applications, not Jobs.
>
> On Wed, Apr 5, 2017 at 12:27 PM, Nick Chammas <nicholas.cham...@gmail.com> wrote:
>
> I'm having trouble understanding the difference between Spark fair
> scheduler pools
> <https://spark.apache.org/docs/latest/job-scheduling.html#fair-scheduler-pools>
> and YARN queues
> <https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html>.
> Do they conflict? Does one override the other?
>
> I posted a more detailed question about an issue I'm having with this on
> Stack Overflow: http://stackoverflow.com/q/43239921/877069
>
> Nick
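
For reference, the two levels Mark describes can be sketched in Python. This is a minimal sketch and not something posted in the thread: the helper names `spark_submit_args` and `run_job_in_pool`, the queue name, and the pool name are all hypothetical. It assumes `sc` is an already-created SparkContext, that the application runs with `spark.scheduler.mode=FAIR`, and that passing `None` from Python clears the pool the way `null` does in the Scala API.

```python
def spark_submit_args(app_path, yarn_queue):
    """Build a spark-submit command line that targets a YARN queue.

    The YARN queue governs resources for the whole Application;
    spark.scheduler.mode=FAIR only affects Jobs inside that Application.
    """
    return [
        "spark-submit",
        "--master", "yarn",
        "--queue", yarn_queue,                  # Application level: YARN queue
        "--conf", "spark.scheduler.mode=FAIR",  # Job level: fair scheduling within the app
        app_path,
    ]


def run_job_in_pool(sc, pool_name, job):
    """Run job(sc) with the current thread's Jobs assigned to pool_name.

    `sc` is expected to expose setLocalProperty, as SparkContext does.
    """
    sc.setLocalProperty("spark.scheduler.pool", pool_name)
    try:
        return job(sc)
    finally:
        # Assumption: None maps to null, which resets this thread to the default pool.
        sc.setLocalProperty("spark.scheduler.pool", None)
```

A call such as `run_job_in_pool(sc, "reports", lambda sc: sc.parallelize(range(100)).count())` would run that one Job in the hypothetical `reports` pool, while `spark_submit_args("app.py", "analytics")` only decides which YARN queue the whole Application draws its resources from. Local properties are per-thread, so Jobs meant to run concurrently in different pools would be submitted from separate threads.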