Re: Launching a Spark application on a subset of machines

2017-02-07 Thread Michael Gummelt
> Looking into Mesos attributes, this seems the perfect fit for it. Is
> that correct?

Yes.

On Tue, Feb 7, 2017 at 3:43 AM, Muhammad Asif Abbasi wrote:
> YARN provides the concept of node labels. You should explore the
> "spark.yarn.executor.nodeLabelConfiguration" property.
>
> Cheers,
> Asif Abbasi
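
A minimal sketch of the Mesos route, assuming the target agents were
started with an illustrative attribute such as spark-pool:high (see the
agent-side sketch further down the thread; host names and the class name
are placeholders, not from the thread):

    # Submit only to agents whose attributes satisfy the constraint.
    spark-submit \
      --master mesos://zk://mesos-master:2181/mesos \
      --conf spark.mesos.constraints="spark-pool:high" \
      --class org.example.MyApp myapp.jar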

Re: Launching a Spark application on a subset of machines

2017-02-07 Thread Muhammad Asif Abbasi
YARN provides the concept of node labels. You should explore the
"spark.yarn.executor.nodeLabelConfiguration" property.

Cheers,
Asif Abbasi

On Tue, 7 Feb 2017 at 10:21, Alvaro Brandon wrote:
> Hello all:
>
> I have the following scenario.
> - I have a cluster of 50 machines with Hadoop and Spark installed on
> them. ...
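
Worth noting: Spark's documentation lists this property as
spark.yarn.executor.nodeLabelExpression, so the name above may be a typo.
A hedged sketch, assuming a label "highmem" and a hypothetical node name:

    # One-time cluster setup: define the label and attach it to a node.
    yarn rmadmin -addToClusterNodeLabels "highmem"
    yarn rmadmin -replaceLabelsOnNode "node1.example.com=highmem"

    # Request executors only on nodes carrying that label.
    spark-submit \
      --master yarn \
      --conf spark.yarn.executor.nodeLabelExpression=highmem \
      --class org.example.MyApp myapp.jar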

Re: Launching a Spark application on a subset of machines

2017-02-07 Thread Alvaro Brandon
I want to scale the number of machines used up or down depending on the
SLA of a job. For example, if I have a low-priority job I will give it 10
machines, while a high-priority one will be given 50. I also want to
choose subsets depending on the hardware. For example, "Launch this job
only on machines ..."
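
If "10 machines vs. 50" only means capping capacity rather than pinning
specific hosts, the executor count can be set at submit time; a rough
sketch on YARN, assuming one executor roughly fills one machine (that
mapping depends entirely on your executor sizing):

    # Low-priority job: roughly 10 machines.
    spark-submit --master yarn --num-executors 10 \
      --executor-cores 8 --executor-memory 24g ...

    # High-priority job: roughly 50 machines.
    spark-submit --master yarn --num-executors 50 \
      --executor-cores 8 --executor-memory 24g ...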

Re: Launching a Spark application on a subset of machines

2017-02-07 Thread Jörn Franke
If you want to run them always on the same machines, use YARN node
labels. If it is any 10 machines, then use the capacity or fair
scheduler. What is the use case for running it always on the same 10
machines? If it is for licensing reasons, then I would ask your vendor
whether this is a suitable means to e...
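
For the "any 10 machines" case, a capacity-scheduler queue caps a job's
share of the cluster; a sketch of capacity-scheduler.xml with an
illustrative queue name and percentages (sibling capacities must sum to
100), plus the matching submit flag:

    <!-- capacity-scheduler.xml: add a low-priority queue at 20% capacity -->
    <property>
      <name>yarn.scheduler.capacity.root.queues</name>
      <value>default,lowpri</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.default.capacity</name>
      <value>80</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.lowpri.capacity</name>
      <value>20</value>
    </property>

    # Submit into that queue.
    spark-submit --master yarn --queue lowpri ...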

Re: Launching a Spark application on a subset of machines

2017-02-07 Thread Alvaro Brandon
Hello Pavel:

Thanks for the pointers. For the standalone cluster manager: I understand
that I just have to start several masters, each with a subset of slaves
attached. Then each master will listen on a different <host, port> pair,
allowing me to spark-submit to any of these pairs depending on the subset
of machines ...
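
A sketch of that setup with hypothetical host names: one master per pool
of machines, workers registered against their pool's master, and
spark-submit pointed at whichever pool the job should use:

    # On master-a (fronting, say, 10 machines) and master-b (the other 40):
    $SPARK_HOME/sbin/start-master.sh          # serves spark://master-a:7077

    # On each worker machine in pool A, attach to that pool's master:
    $SPARK_HOME/sbin/start-slave.sh spark://master-a:7077

    # Submit against the pool you want:
    spark-submit --master spark://master-a:7077 \
      --class org.example.MyApp myapp.jar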

Re: Launching a Spark application on a subset of machines

2017-02-07 Thread Pavel Plotnikov
Hi, Alvaro

You can create different clusters using the standalone cluster manager,
and then manage subsets of machines by submitting applications to
different masters. Or you can use Mesos attributes to mark a subset of
workers and specify it in spark.mesos.constraints.

On Tue, Feb 7, 2017 at 1:21 PM, Alvaro Brandon wrote: ...
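
On the Mesos side, "marking" a subset of workers means starting those
agents with attributes; the matching spark.mesos.constraints submit is
sketched at the top of the thread. The attribute name and value here are
illustrative:

    # On each agent in the chosen subset:
    mesos-agent --master=zk://mesos-master:2181/mesos \
      --attributes="spark-pool:high" \
      --work_dir=/var/lib/mesos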

Launching a Spark application on a subset of machines

2017-02-07 Thread Alvaro Brandon
Hello all:

I have the following scenario.
- I have a cluster of 50 machines with Hadoop and Spark installed on them.
- I want to launch one Spark application through spark-submit. However, I
want this application to run on only a subset of these machines,
disregarding data locality (e.g. 10 machines ...)