@yohann Looks like something is wrong with my environment, though I have yet to figure out what. The theory so far makes sense: I also tried the same thing in other environments with a very minimal configuration like mine and it works fine, so clearly something is wrong with my env. What I don't understand is why the node automatically goes into the UNHEALTHY state; the INFO logs don't tell me why.
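In case it helps, this is roughly how I am checking the node state from the command line. Treat it as a sketch: the log path and the disk-checker hunch in the comments are just guesses from my own setup.

    # Ask the ResourceManager which nodes it knows about, including UNHEALTHY ones.
    yarn node -list -all
    # yarn node -status <NodeId>   # prints the health report for one node from the list above

    # The reason a node is marked UNHEALTHY usually shows up in the NodeManager
    # log / health report rather than in the ResourceManager INFO logs. On a
    # laptop, a common cause is the disk health checker marking local-dirs and
    # log-dirs bad once the disk goes above ~90% utilization
    # (yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage).
    grep -iE "health|local-dirs|log-dirs" $HADOOP_HOME/logs/*nodemanager*.log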
On Sun, Jul 8, 2018 at 7:36 PM, kant kodali <kanth...@gmail.com> wrote:

> @yohann Thanks for shining some light! It is making more sense now.
>
> I think you are correct when you stated: "Your application master is just asking for more resources than the default Yarn queue is allowed to provide".
>
> Attached are the screenshots of the UI pages you mentioned. The thing that catches my eye is the default queue's resources under the scheduler section. The default queue as listed in the screenshot has the following:
>
> Max Application Master Resources: <memory:0, vCores:0>
> Used Application Master Resources: <memory:1024, vCores:1>
>
> Is this why my spark-shell gets stuck in the ACCEPTED state forever? I am pretty much using the default config, so is there a config I should add to set the Max Application Master Resources?
>
> Thanks!
>
> On Sun, Jul 8, 2018 at 10:27 AM, yohann jardin <yohannjar...@hotmail.com> wrote:
>
>> When you run on Yarn, you don't even need to start a Spark cluster (Spark master and slaves). Yarn receives a job and then allocates resources for the application master and then its workers.
>>
>> Check the resources available in the node section of the resource manager UI (and is your node actually detected as alive?), as well as the scheduler section to check the default queue resources.
>> If you seem to lack resources for your driver, you can try to reduce the driver memory by specifying "--driver-memory 512" for example, but I'd expect the default of 1g to be low enough based on what you showed us.
>>
>> *Yohann Jardin*
>> On 7/8/2018 at 6:11 PM, kant kodali wrote:
>>
>> @yohann Sorry, I am assuming you meant the application master; if so, I believe Spark is the one that provides the application master. Is there any way to see how many resources are being requested and how many Yarn is allowed to provide? I would assume this is a common case, so I am not sure why these numbers are not part of the resource manager logs.
>>
>> On Sun, Jul 8, 2018 at 8:09 AM, kant kodali <kanth...@gmail.com> wrote:
>>
>>> yarn.scheduler.capacity.maximum-am-resource-percent is set to 0.1 by default and I tried changing it to 1.0, but still no luck; the same problem persists. The master here is yarn and I am just trying to spawn spark-shell --master yarn --deploy-mode client and run a simple word count, so I am not sure why it would request more resources.
>>>
>>> On Sun, Jul 8, 2018 at 8:02 AM, yohann jardin <yohannjar...@hotmail.com> wrote:
>>>
>>>> Following the logs from the resource manager:
>>>>
>>>> 2018-07-08 07:23:23,382 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: maximum-am-resource-percent is insufficient to start a single application in queue, it is likely set too low. skipping enforcement to allow at least one application to start
>>>>
>>>> 2018-07-08 07:23:23,382 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: maximum-am-resource-percent is insufficient to start a single application in queue for user, it is likely set too low. skipping enforcement to allow at least one application to start
>>>>
>>>> I'd say it has nothing to do with Spark. Your application master is just asking for more resources than the default Yarn queue is allowed to provide.
>>>> You might take a look at https://hadoop.apache.org/docs/r2.7.3/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html and search for maximum-am-resource-percent.
>>>>
>>>> Regards,
>>>>
>>>> *Yohann Jardin*
>>>> On 7/8/2018 at 4:40 PM, kant kodali wrote:
>>>>
>>>> Hi,
>>>>
>>>> It's on a local MacBook Pro machine with 16GB RAM, a 512GB disk, and 8 vCPUs! I am not running any code, since I can't even spawn spark-shell with yarn as master, as described in my previous email. I just want to run a simple word count using yarn as master.
>>>>
>>>> Thanks!
>>>>
>>>> Below is the resource manager log once again, if that helps:
>>>>
>>>> 2018-07-08 07:23:23,343 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application added - appId: application_1531059242261_0001 user: xxx leaf-queue of parent: root #applications: 1
>>>>
>>>> 2018-07-08 07:23:23,344 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Accepted application application_1531059242261_0001 from user: xxx, in queue: default
>>>>
>>>> 2018-07-08 07:23:23,350 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1531059242261_0001 State change from SUBMITTED to ACCEPTED on event=APP_ACCEPTED
>>>>
>>>> 2018-07-08 07:23:23,370 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1531059242261_0001_000001
>>>>
>>>> 2018-07-08 07:23:23,370 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1531059242261_0001_000001 State change from NEW to SUBMITTED
>>>>
>>>> 2018-07-08 07:23:23,382 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: maximum-am-resource-percent is insufficient to start a single application in queue, it is likely set too low. skipping enforcement to allow at least one application to start
>>>>
>>>> 2018-07-08 07:23:23,382 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: maximum-am-resource-percent is insufficient to start a single application in queue for user, it is likely set too low. skipping enforcement to allow at least one application to start
>>>>
>>>> 2018-07-08 07:23:23,382 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application application_1531059242261_0001 from user: xxx activated in queue: default
>>>>
>>>> 2018-07-08 07:23:23,382 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application added - appId: application_1531059242261_0001 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@476750cd, leaf-queue: default #user-pending-applications: 0 #user-active-applications: 1 #queue-pending-applications: 0 #queue-active-applications: 1
>>>>
>>>> 2018-07-08 07:23:23,382 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Added Application Attempt appattempt_1531059242261_0001_000001 to scheduler from user xxx in queue default
>>>>
>>>> 2018-07-08 07:23:23,386 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1531059242261_0001_000001 State change from SUBMITTED to SCHEDULED
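P.S. for anyone who finds this thread later: a rough command-line sketch of the same checks discussed above (the queue name and values are just the defaults from this thread, and the refreshQueues step assumes the CapacityScheduler config lives in $HADOOP_CONF_DIR/capacity-scheduler.xml):

    # Show the default queue's state and capacities as the scheduler sees them.
    yarn queue -status default

    # Applications waiting for AM resources sit here in the ACCEPTED state.
    yarn application -list -appStates ACCEPTED

    # If maximum-am-resource-percent really is the limit, raise
    # yarn.scheduler.capacity.maximum-am-resource-percent in capacity-scheduler.xml
    # (see the CapacityScheduler link above) and reload the queue configuration:
    yarn rmadmin -refreshQueues

    # Then retry, optionally with a smaller driver as suggested earlier:
    spark-shell --master yarn --deploy-mode client --driver-memory 512m

That said, if the only node is UNHEALTHY (as it turned out to be here), the queue has no cluster resources to take a percentage of, which would explain the Max Application Master Resources of <memory:0, vCores:0> no matter what the percentage is set to.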