Hi, I’m new to Spark and trying to test my first Spark program. SparkPi runs successfully in yarn-client mode, but when I run the same in yarn-standalone mode, the app gets stuck in the ACCEPTED state. I’ve spent hours trying to hunt down the reason, but the outcome is always the same. Any hints on what to look for next?
cheers,
-jan

vagrant@vm-cluster-node1:~$ ./run_pi.sh
14/05/20 06:24:04 INFO RMProxy: Connecting to ResourceManager at vm-cluster-node2/10.211.55.101:8032
14/05/20 06:24:05 INFO Client: Got Cluster metric info from ApplicationsManager (ASM), number of NodeManagers: 2
14/05/20 06:24:05 INFO Client: Queue info ... queueName: root.default, queueCurrentCapacity: 0.0, queueMaxCapacity: -1.0, queueApplicationCount = 0, queueChildQueueCount = 0
14/05/20 06:24:05 INFO Client: Max mem capabililty of a single resource in this cluster 2048
14/05/20 06:24:05 INFO Client: Preparing Local resources
14/05/20 06:24:05 INFO Client: Uploading file:/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/spark/assembly/lib/spark-assembly_2.10-0.9.0-cdh5.0.0-hadoop2.3.0-cdh5.0.0.jar to hdfs://vm-cluster-node2:8020/user/vagrant/.sparkStaging/application_1400563733088_0012/spark-assembly_2.10-0.9.0-cdh5.0.0-hadoop2.3.0-cdh5.0.0.jar
14/05/20 06:24:07 INFO Client: Setting up the launch environment
14/05/20 06:24:07 INFO Client: Setting up container launch context
14/05/20 06:24:07 INFO Client: Command for starting the Spark ApplicationMaster: java -server -Xmx1024m -Djava.io.tmpdir=$PWD/tmp org.apache.spark.deploy.yarn.ApplicationMaster --class org.apache.spark.examples.SparkPi --jar /opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/spark/assembly/lib/spark-assembly_2.10-0.9.0-cdh5.0.0-hadoop2.3.0-cdh5.0.0.jar --args 'yarn-standalone' --args '10' --worker-memory 500 --worker-cores 1 --num-workers 1 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr
14/05/20 06:24:07 INFO Client: Submitting application to ASM
14/05/20 06:24:07 INFO YarnClientImpl: Submitted application application_1400563733088_0012
14/05/20 06:24:08 INFO Client: Application report from ASM: <THIS PART KEEPS REPEATING FOREVER>
    application identifier: application_1400563733088_0012
    appId: 12
    clientToAMToken: null
    appDiagnostics:
    appMasterHost: N/A
    appQueue: root.vagrant
    appMasterRpcPort: -1
    appStartTime: 1400567047343
    yarnAppState: ACCEPTED
    distributedFinalState: UNDEFINED
    appTrackingUrl: http://vm-cluster-node2:8088/proxy/application_1400563733088_0012/
    appUser: vagrant

The log files give me no additional help. The latest log entry just acknowledges the state change:

hadoop-yarn/hadoop-cmf-yarn-RESOURCEMANAGER-vm-cluster-node2.log.out:2014-05-20 06:24:07,347 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1400563733088_0012 State change from SUBMITTED to ACCEPTED

I’m running the example in a local test environment with three virtual nodes in Cloudera (CDH5). Below is run_pi.sh:

#!/bin/bash

export SPARK_HOME=/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/spark
export STANDALONE_SPARK_MASTER_HOST=vm-cluster-node2
export SPARK_MASTER_PORT=7077
export DEFAULT_HADOOP_HOME=/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/hadoop
export SPARK_JAR_HDFS_PATH=/user/spark/share/lib/spark-assembly.jar
export SPARK_LAUNCH_WITH_SCALA=0
export SPARK_LIBRARY_PATH=${SPARK_HOME}/lib
export SCALA_LIBRARY_PATH=${SPARK_HOME}/lib
export SPARK_MASTER_IP=$STANDALONE_SPARK_MASTER_HOST
export HADOOP_HOME=${HADOOP_HOME:-$DEFAULT_HADOOP_HOME}

if [ -n "$HADOOP_HOME" ]; then
  export SPARK_LIBRARY_PATH=$SPARK_LIBRARY_PATH:${HADOOP_HOME}/lib/native
fi

export SPARK_JAR=hdfs://vm-cluster-node2:8020/user/spark/share/lib/spark-assembly.jar

APP_JAR=/opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/spark/assembly/lib/spark-assembly_2.10-0.9.0-cdh5.0.0-hadoop2.3.0-cdh5.0.0.jar

$SPARK_HOME/bin/spark-class org.apache.spark.deploy.yarn.Client \
  --jar $APP_JAR \
  --class org.apache.spark.examples.SparkPi \
  --args yarn-standalone \
  --args 10 \
  --num-workers 1 \
  --master-memory 1g \
  --worker-memory 500m \
  --worker-cores 1