I actually have the same problem, but I am not sure whether it is a Spark problem or a YARN problem.
I set up a five-node cluster on AWS EMR and started the YARN NodeManager daemon on the master as well (it is not started on the master by default, and I don't want to waste any resources since I have to pay for the node). I then submit the Spark job in yarn-cluster mode. The command is:

    ./spark/bin/spark-submit --master yarn-cluster --num-executors 5 --executor-cores 4 --properties-file spark-application.conf myapp.py

But the YARN ResourceManager only created 4 containers on 4 nodes, and one node was completely idle.

More details about the setup:

EMR node type: m3.xlarge (16 GB RAM, 4 cores, 40 GB SSD; HDFS on EBS?)

yarn-site.xml:
    yarn.scheduler.maximum-allocation-mb=11520
    yarn.nodemanager.resource.memory-mb=11520

Spark conf:
    spark.executor.memory 10g
    spark.storage.memoryFraction 0.2
    spark.python.worker.memory 1500m
    spark.akka.frameSize 200
    spark.shuffle.memoryFraction 0.1
    spark.driver.memory 10g

Hadoop behavior observed:
- YARN created 4 containers on four nodes, including the EMR master, but one EMR slave stayed idle (memory consumption around 2 GB and 0% CPU).
- Spark used one container for the driver on an EMR slave node (which makes sense, since I requested that much memory).
- The other three nodes were used to compute the tasks.

If YARN can't use all the nodes while I still have to pay for them, it's just a big waste :p
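Since I am not sure yet whether this is on the Spark side or the YARN side, the first thing I plan to double-check is whether all five NodeManagers registered with the ResourceManager and are reporting their full capacity. A rough sketch of the checks (plain YARN CLI commands, nothing Spark-specific; the node id below is just a placeholder, not a real one from my cluster):

    # List the NodeManagers registered with the ResourceManager,
    # with their state and number of running containers
    yarn node -list

    # Show the memory/vcore capacity and usage reported by one node
    # (ip-10-0-0-1.ec2.internal:8041 is a placeholder node id)
    yarn node -status ip-10-0-0-1.ec2.internal:8041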
Any thoughts on this?

Great thanks,
Ed

2015-05-18 12:07 GMT-04:00 Sandy Ryza <sandy.r...@cloudera.com>:

> *All
>
> On Mon, May 18, 2015 at 9:07 AM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>
>> Hi Xiaohe,
>>
>> All Spark options must go before the jar or they won't take effect.
>>
>> -Sandy
>>
>> On Sun, May 17, 2015 at 8:59 AM, xiaohe lan <zombiexco...@gmail.com> wrote:
>>
>>> Sorry, they are actually both assigned tasks.
>>>
>>> Aggregated Metrics by Executor:
>>>
>>> Executor 1 (host1:6184) - task time 1.7 min, 5 total tasks, 0 failed, 5 succeeded,
>>> input size / records 640.0 MB / 12318400, shuffle write size / records 382.3 MB / 12100770,
>>> shuffle spill (memory) 1630.4 MB, shuffle spill (disk) 295.4 MB
>>>
>>> Executor 2 (host2:62072) - task time 1.7 min, 5 total tasks, 0 failed, 5 succeeded,
>>> input size / records 640.0 MB / 12014510, shuffle write size / records 386.0 MB / 10926912,
>>> shuffle spill (memory) 1646.6 MB, shuffle spill (disk) 304.8 MB
>>>
>>> On Sun, May 17, 2015 at 11:50 PM, xiaohe lan <zombiexco...@gmail.com> wrote:
>>>
>>>> bash-4.1$ ps aux | grep SparkSubmit
>>>> xilan  1704 13.2  1.2 5275520 380244 pts/0 Sl+ 08:39 0:13 /scratch/xilan/jdk1.8.0_45/bin/java -cp /scratch/xilan/spark/conf:/scratch/xilan/spark/lib/spark-assembly-1.3.1-hadoop2.4.0.jar:/scratch/xilan/spark/lib/datanucleus-core-3.2.10.jar:/scratch/xilan/spark/lib/datanucleus-api-jdo-3.2.6.jar:/scratch/xilan/spark/lib/datanucleus-rdbms-3.2.9.jar:/scratch/xilan/hadoop/etc/hadoop -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit --master yarn target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp --num-executors 5 --executor-cores 4
>>>> xilan  1949  0.0  0.0  103292    800 pts/1 S+  08:40 0:00 grep --color SparkSubmit
>>>>
>>>> When I look at the Spark UI, I see the following:
>>>>
>>>> Aggregated Metrics by Executor:
>>>>
>>>> Executor 1 (host1:30483) - task time 6 s, 1 total task, 0 failed, 1 succeeded,
>>>> shuffle read size / records 127.1 MB / 2808978
>>>>
>>>> Executor 2 (host2:4997) - task time 0 ms, 0 total tasks, 0 failed, 0 succeeded,
>>>> shuffle read size / records 63.4 MB / 1810945
>>>>
>>>> So executor 2 is not even assigned a task? Maybe there is a problem in my settings, but I don't know which settings I have set wrong or have not set at all.
>>>>
>>>> Thanks,
>>>> Xiaohe
>>>>
>>>> On Sun, May 17, 2015 at 11:16 PM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>>>>
>>>>> Did you try the --executor-cores param? While you submit the job, do a ps aux | grep spark-submit and see the exact command parameters.
>>>>>
>>>>> Thanks
>>>>> Best Regards
>>>>>
>>>>> On Sat, May 16, 2015 at 12:31 PM, xiaohe lan <zombiexco...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have a 5-node YARN cluster, and I used spark-submit to submit a simple app:
>>>>>>
>>>>>> spark-submit --master yarn target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp --num-executors 5
>>>>>>
>>>>>> I have set the number of executors to 5, but from the Spark UI I could see only two executors and it ran very slowly. What did I miss?
>>>>>>
>>>>>> Thanks,
>>>>>> Xiaohe
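P.S. Re-reading Sandy's point above about option ordering, here is a rough sketch of what I understand the corrected form of Xiaohe's original command to be, with every Spark option placed before the application jar (anything after the jar is passed as an argument to the application itself rather than to spark-submit):

    spark-submit \
      --master yarn \
      --class scala.SimpleApp \
      --num-executors 5 \
      --executor-cores 4 \
      target/scala-2.10/simple-project_2.10-1.0.jar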