Are you able to run a simple MapReduce job on YARN without any issues? If
you have any issues: I had this problem on Mac. Use csrutil to disable SIP
(System Integrity Protection), then add a symlink:

sudo ln -s /usr/bin/java /bin/java

Newer versions of macOS (El Capitan and later) do not allow creating
/bin/java without disabling SIP first. I got everything working with the
above.

Best,
Ravion

On Sun, Jul 8, 2018 at 10:20 AM Marco Mistroni <mmistr...@gmail.com> wrote:

> You running on EMR? You checked the EMR logs?
> I was in a similar situation where the job was stuck in ACCEPTED and then
> it died.. turned out to be an issue with my code when running with huge
> data. Perhaps try to gradually reduce the load until it works and then
> start from there?
> Not a huge help, but I followed the same approach when my job was stuck
> on ACCEPTED.
> Hth
>
> On Sun, Jul 8, 2018, 2:59 PM kant kodali <kanth...@gmail.com> wrote:
>
>> Hi All,
>>
>> I am trying to run a simple word count using YARN as the cluster
>> manager. I am currently using Spark 2.3.1 and Apache Hadoop 2.7.3. When
>> I spawn spark-shell as below, it gets stuck in the ACCEPTED state
>> forever.
>>
>> ./bin/spark-shell --master yarn --deploy-mode client
>>
>> I set my log4j.properties in SPARK_HOME/conf to TRACE.
>>
>> queue: "default" name: "Spark shell" host: "N/A" rpc_port: -1
>> yarn_application_state: ACCEPTED trackingUrl:
>> "http://Kants-MacBook-Pro-2.local:8088/proxy/application_1531056583425_0001/"
>> diagnostics: "" startTime: 1531056632496 finishTime: 0
>> final_application_status: APP_UNDEFINED app_resource_Usage {
>> num_used_containers: 0 num_reserved_containers: 0 used_resources {
>> memory: 0 virtual_cores: 0 } reserved_resources { memory: 0
>> virtual_cores: 0 } needed_resources { memory: 0 virtual_cores: 0 }
>> memory_seconds: 0 vcore_seconds: 0 } originalTrackingUrl: "N/A"
>> currentApplicationAttemptId { application_id { id: 1 cluster_timestamp:
>> 1531056583425 } attemptId: 1 } progress: 0.0 applicationType: "SPARK" }}
>>
>> 18/07/08 06:32:22 INFO Client: Application report for
>> application_1531056583425_0001 (state: ACCEPTED)
>>
>> 18/07/08 06:32:22 DEBUG Client:
>> client token: N/A
>> diagnostics: N/A
>> ApplicationMaster host: N/A
>> ApplicationMaster RPC port: -1
>> queue: default
>> start time: 1531056632496
>> final status: UNDEFINED
>> tracking URL:
>> http://xxx-MacBook-Pro-2.local:8088/proxy/application_1531056583425_0001/
>> user: xxx
>>
>> 18/07/08 06:32:20 DEBUG Client:
>> client token: N/A
>> diagnostics: N/A
>> ApplicationMaster host: N/A
>> ApplicationMaster RPC port: -1
>> queue: default
>> start time: 1531056632496
>> final status: UNDEFINED
>> tracking URL:
>> http://Kants-MacBook-Pro-2.local:8088/proxy/application_1531056583425_0001/
>> user: kantkodali
>>
>> 18/07/08 06:32:21 TRACE ProtobufRpcEngine: 1: Call -> /0.0.0.0:8032:
>> getApplicationReport {application_id { id: 1 cluster_timestamp:
>> 1531056583425 }}
>> 18/07/08 06:32:21 DEBUG Client: IPC Client (1608805714) connection to
>> /0.0.0.0:8032 from kantkodali sending #136
>> 18/07/08 06:32:21 DEBUG Client: IPC Client (1608805714) connection to
>> /0.0.0.0:8032 from kantkodali got value #136
>> 18/07/08 06:32:21 DEBUG ProtobufRpcEngine: Call: getApplicationReport
>> took 1ms
>> 18/07/08 06:32:21 TRACE ProtobufRpcEngine: 1: Response <- /0.0.0.0:8032:
>> getApplicationReport {application_report { applicationId { id: 1
>> cluster_timestamp: 1531056583425 } user: "xxx" queue: "default" name:
>> "Spark shell" host: "N/A" rpc_port: -1 yarn_application_state: ACCEPTED
>> trackingUrl:
>> "http://xxx-MacBook-Pro-2.local:8088/proxy/application_1531056583425_0001/"
>> diagnostics: "" startTime: 1531056632496 finishTime: 0
>> final_application_status: APP_UNDEFINED app_resource_Usage {
>> num_used_containers: 0 num_reserved_containers: 0 used_resources {
>> memory: 0 virtual_cores: 0 } reserved_resources { memory: 0
>> virtual_cores: 0 } needed_resources { memory: 0 virtual_cores: 0 }
>> memory_seconds: 0 vcore_seconds: 0 } originalTrackingUrl: "N/A"
>> currentApplicationAttemptId { application_id { id: 1 cluster_timestamp:
>> 1531056583425 } attemptId: 1 } progress: 0.0 applicationType: "SPARK" }}
>>
>> 18/07/08 06:32:21 INFO Client: Application report for
>> application_1531056583425_0001 (state: ACCEPTED)
>>
>> I have read this link
>> <https://stackoverflow.com/questions/32658840/spark-shell-stuck-in-yarn-accepted-state>
>> and here are the conf files that differ from the default settings.
>>
>> *yarn-site.xml*
>>
>> <configuration>
>>   <property>
>>     <name>yarn.nodemanager.aux-services</name>
>>     <value>mapreduce_shuffle</value>
>>   </property>
>>   <property>
>>     <name>yarn.nodemanager.resource.memory-mb</name>
>>     <value>16384</value>
>>   </property>
>>   <property>
>>     <name>yarn.scheduler.minimum-allocation-mb</name>
>>     <value>256</value>
>>   </property>
>>   <property>
>>     <name>yarn.scheduler.maximum-allocation-mb</name>
>>     <value>8192</value>
>>   </property>
>>   <property>
>>     <name>yarn.nodemanager.resource.cpu-vcores</name>
>>     <value>8</value>
>>   </property>
>> </configuration>
>>
>> *core-site.xml*
>>
>> <configuration>
>>   <property>
>>     <name>fs.defaultFS</name>
>>     <value>hdfs://localhost:9000</value>
>>   </property>
>> </configuration>
>>
>> *hdfs-site.xml*
>>
>> <configuration>
>>   <property>
>>     <name>dfs.replication</name>
>>     <value>1</value>
>>   </property>
>> </configuration>
>>
>> You can imagine every other config remains untouched (so everything else
>> has default settings). Finally, I have also tried to see if there are
>> any clues in the resource manager logs, but they don't seem to be
>> helpful in terms of fixing the issue. However, I am a newbie to YARN, so
>> please let me know if I missed something.
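One quick sanity check before digging into the scheduler logs: does the ApplicationMaster container Spark is requesting even fit the limits in the yarn-site.xml above? The sketch below is a back-of-the-envelope calculation, assuming Spark 2.3's documented client-mode defaults (spark.yarn.am.memory = 512m, plus an overhead of max(384 MB, 10% of the AM memory)); the allocation limits are the ones from the posted config.

```shell
#!/bin/sh
# Rough check: will YARN's normalized AM request fit under
# yarn.scheduler.maximum-allocation-mb? (limits from the yarn-site.xml above)
am_mem=512        # spark.yarn.am.memory default (MB) in client mode
overhead=384      # max(384, 10% of am_mem); 10% of 512 is 51, so 384 wins
min_alloc=256     # yarn.scheduler.minimum-allocation-mb
max_alloc=8192    # yarn.scheduler.maximum-allocation-mb

requested=$(( am_mem + overhead ))
# YARN normalizes each request up to the next multiple of min_alloc
normalized=$(( (requested + min_alloc - 1) / min_alloc * min_alloc ))

echo "AM container request after normalization: ${normalized} MB"
if [ "$normalized" -gt "$max_alloc" ]; then
  echo "Request exceeds maximum-allocation-mb; the app can never be scheduled."
else
  echo "Request fits; the hang is more likely a NodeManager/resource issue."
fi
```

With the posted settings this normalizes to 1024 MB, well under the 8192 MB cap, so the allocation limits are probably not the culprit here. It is also worth confirming with `yarn node -list` that at least one NodeManager is registered and RUNNING, since an application with zero available nodes sits in ACCEPTED exactly like this.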
>> 2018-07-08 06:54:57,345 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated
>> new applicationId: 1
>> 2018-07-08 06:55:09,413 WARN
>> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The
>> specific max attempts: 0 for application: 1 is invalid, because it is
>> out of the range [1, 2]. Use the global max attempts instead.
>> 2018-07-08 06:55:09,414 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService:
>> Application with id 1 submitted by user xxx
>> 2018-07-08 06:55:09,415 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing
>> application with id application_1531058076308_0001
>> 2018-07-08 06:55:09,416 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger:
>> USER=kantkodali IP=10.0.0.58 OPERATION=Submit Application Request
>> TARGET=ClientRMService RESULT=SUCCESS
>> APPID=application_1531058076308_0001
>> 2018-07-08 06:55:09,422 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
>> application_1531058076308_0001 State change from NEW to NEW_SAVING on
>> event=START
>> 2018-07-08 06:55:09,422 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore:
>> Storing info for app: application_1531058076308_0001
>> 2018-07-08 06:55:09,423 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
>> application_1531058076308_0001 State change from NEW_SAVING to SUBMITTED
>> on event=APP_NEW_SAVED
>> 2018-07-08 06:55:09,425 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue:
>> Application added - appId: application_1531058076308_0001 user:
>> kantkodali leaf-queue of parent: root #applications: 1
>> 2018-07-08 06:55:09,425 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>> Accepted application application_1531058076308_0001 from user:
>> kantkodali, in queue: default
>> 2018-07-08 06:55:09,439 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
>> application_1531058076308_0001 State change from SUBMITTED to ACCEPTED
>> on event=APP_ACCEPTED
>> 2018-07-08 06:55:09,470 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
>> Registering app attempt : appattempt_1531058076308_0001_000001
>> 2018-07-08 06:55:09,471 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>> appattempt_1531058076308_0001_000001 State change from NEW to SUBMITTED
>> 2018-07-08 06:55:09,481 WARN
>> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
>> maximum-am-resource-percent is insufficient to start a single
>> application in queue, it is likely set too low. skipping enforcement to
>> allow at least one application to start
>> 2018-07-08 06:55:09,481 WARN
>> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
>> maximum-am-resource-percent is insufficient to start a single
>> application in queue for user, it is likely set too low. skipping
>> enforcement to allow at least one application to start
>> 2018-07-08 06:55:09,481 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
>> Application application_1531058076308_0001 from user: xxx activated in
>> queue: default
>> 2018-07-08 06:55:09,482 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
>> Application added - appId: application_1531058076308_0001 user:
>> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@fdd759d,
>> leaf-queue: default #user-pending-applications: 0
>> #user-active-applications: 1 #queue-pending-applications: 0
>> #queue-active-applications: 1
>> 2018-07-08 06:55:09,482 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>> Added Application Attempt appattempt_1531058076308_0001_000001 to
>> scheduler from user kantkodali in queue default
>> 2018-07-08 06:55:09,484 INFO
>> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
>> appattempt_1531058076308_0001_000001 State change from SUBMITTED to
>> SCHEDULED
>>
>> Any help would be great!
>>
>> Thanks!
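A side note on the two LeafQueue warnings in the RM log above: maximum-am-resource-percent (default 0.1 in the CapacityScheduler) caps how much of the queue's resources ApplicationMasters may occupy. The log shows enforcement being skipped so that this first application can still start, but with a second concurrent application the limit would start to bite. If that ever turns out to be the bottleneck, the property can be raised in capacity-scheduler.xml; 0.5 below is only an illustrative value, not a recommendation.

```xml
<!-- etc/hadoop/capacity-scheduler.xml -->
<property>
  <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
  <!-- fraction of cluster resources ApplicationMasters may use; default 0.1 -->
  <value>0.5</value>
</property>
```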