Hi Tom,

I believe a workaround is to set `spark.dynamicAllocation.initialExecutors` to 0. As others have mentioned, from Spark 1.5.2 onwards this should no longer be necessary.
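
Concretely, with the command from your first mail it would look something like this (untested sketch; I've dropped the three timeout flags for brevity, and if Spark complains that the initial count is below the minimum, lower minExecutors to match):

  spark-shell \
    --conf spark.dynamicAllocation.enabled=true \
    --conf spark.shuffle.service.enabled=true \
    --conf spark.dynamicAllocation.initialExecutors=0 \
    --conf spark.dynamicAllocation.minExecutors=4 \
    --conf spark.dynamicAllocation.maxExecutors=12 \
    --executor-memory 512m --master yarn-client --queue default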
-Andrew

2015-11-09 8:19 GMT-08:00 Jonathan Kelly <jonathaka...@gmail.com>:

> Tom,
>
> You might be hitting https://issues.apache.org/jira/browse/SPARK-10790,
> which was introduced in Spark 1.5.0 and fixed in 1.5.2. Spark 1.5.2 just
> passed release-candidate voting, so it should be tagged, released, and
> announced soon. If you are able to build from source and run that
> yourself, you might want to try building from the v1.5.2-rc2 tag to see
> if it fixes your issue. Otherwise, hopefully Spark 1.5.2 will be
> available for download very soon.
>
> ~ Jonathan
>
> On Mon, Nov 9, 2015 at 6:08 AM, Akhil Das <ak...@sigmoidanalytics.com>
> wrote:
>
>> Did you go through
>> http://spark.apache.org/docs/latest/job-scheduling.html#configuration-and-setup
>> for YARN? I guess you will have to copy spark-1.5.1-yarn-shuffle.jar to
>> the classpath of every NodeManager in your cluster.
>>
>> Thanks
>> Best Regards
>>
>> On Fri, Oct 30, 2015 at 7:41 PM, Tom Stewart
>> <stewartthom...@yahoo.com.invalid> wrote:
>>
>>> I am running the following command on a Hadoop cluster to launch the
>>> Spark shell with dynamic resource allocation (DRA):
>>>
>>> spark-shell \
>>>   --conf spark.dynamicAllocation.enabled=true \
>>>   --conf spark.shuffle.service.enabled=true \
>>>   --conf spark.dynamicAllocation.minExecutors=4 \
>>>   --conf spark.dynamicAllocation.maxExecutors=12 \
>>>   --conf spark.dynamicAllocation.sustainedSchedulerBacklogTimeout=120 \
>>>   --conf spark.dynamicAllocation.schedulerBacklogTimeout=300 \
>>>   --conf spark.dynamicAllocation.executorIdleTimeout=60 \
>>>   --executor-memory 512m --master yarn-client --queue default
>>>
>>> This is the code I'm running within the Spark shell -- just demo code
>>> from the web site:
>>>
>>> import org.apache.spark.mllib.clustering.KMeans
>>> import org.apache.spark.mllib.linalg.Vectors
>>>
>>> // Load and parse the data
>>> val data = sc.textFile("hdfs://ns/public/sample/kmeans_data.txt")
>>> val parsedData = data.map(s =>
>>>   Vectors.dense(s.split(' ').map(_.toDouble))).cache()
>>>
>>> // Cluster the data into two classes using KMeans
>>> val numClusters = 2
>>> val numIterations = 20
>>> val clusters = KMeans.train(parsedData, numClusters, numIterations)
>>>
>>> This works fine on Spark 1.4.1 but fails on Spark 1.5.1. Did something
>>> change that I need to do differently for DRA on 1.5.1?
>>>
>>> This is the error I am getting (the same warning repeats every 15
>>> seconds):
>>>
>>> 15/10/29 21:44:19 WARN YarnScheduler: Initial job has not accepted any
>>> resources; check your cluster UI to ensure that workers are registered
>>> and have sufficient resources
>>>
>>> That happens to be the same error you get if you haven't followed the
>>> steps to enable DRA. However, I have done those, and as I said, if I
>>> just flip back to Spark 1.4.1 on the same cluster, it works with my
>>> YARN config.
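
In case the shuffle service does turn out to be the culprit after all: if I remember the docs Akhil linked correctly, the setup boils down to registering the service in yarn-site.xml on every NodeManager, plus putting the shuffle jar on the NodeManager classpath, then restarting the NodeManagers. Roughly:

  <!-- yarn-site.xml: register Spark's external shuffle service -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle,spark_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
    <value>org.apache.spark.network.yarn.YarnShuffleService</value>
  </property>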