It makes scheduling faster. If you have a node that can accommodate 20 containers, and the scheduler assigns only one container per heartbeat, it would take 20 heartbeats (about 20 seconds at the default 1-second node heartbeat interval) to schedule all the containers. OTOH if it assigns multiple containers per heartbeat, the node fills up much faster.
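To make the arithmetic concrete, a small sketch (not a YARN API -- `time_to_fill` is a hypothetical helper, and the 1-second heartbeat interval is the default `yarn.resourcemanager.nodemanagers.heartbeat-interval-ms` of 1000 ms):

```python
import math

def time_to_fill(containers, assignments_per_heartbeat, heartbeat_interval_s=1.0):
    """Rough time for the scheduler to place `containers` containers on one
    node, given a fixed number of container assignments per node heartbeat."""
    heartbeats_needed = math.ceil(containers / assignments_per_heartbeat)
    return heartbeats_needed * heartbeat_interval_s

# One assignment per heartbeat: 20 heartbeats, ~20 seconds
print(time_to_fill(20, 1))  # 20.0
# Five assignments per heartbeat: 4 heartbeats, ~4 seconds
print(time_to_fill(20, 5))  # 4.0
```

This is why multiple assignments per heartbeat is the default: it trades some spreading of containers for lower scheduling latency.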
- Hari

On Mon, 20 May 2019, 15:40 Akshay Bhardwaj, <akshay.bhardwaj1...@gmail.com> wrote:

> Hi Hari,
>
> Thanks for this information.
>
> Do you have any resources on, or can you explain, why YARN has this as the
> default behaviour? What would be the advantages/scenarios of having
> multiple assignments in a single heartbeat?
>
> Regards
> Akshay Bhardwaj
> +91-97111-33849
>
> On Mon, May 20, 2019 at 1:29 PM Hariharan <hariharan...@gmail.com> wrote:
>
>> Hi Akshay,
>>
>> I believe HDP uses the capacity scheduler by default. In the capacity
>> scheduler, assignment of multiple containers on the same node is
>> determined by the option
>> yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled,
>> which is true by default. If you would like YARN to spread out the
>> containers, you can set this to false.
>>
>> You can read about this and associated parameters here:
>> https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html
>>
>> ~ Hari
>>
>> On Mon, May 20, 2019 at 11:16 AM Akshay Bhardwaj
>> <akshay.bhardwaj1...@gmail.com> wrote:
>> >
>> > Hi All,
>> >
>> > Just floating this email again. Grateful for any suggestions.
>> >
>> > Akshay Bhardwaj
>> > +91-97111-33849
>> >
>> > On Mon, May 20, 2019 at 12:25 AM Akshay Bhardwaj <
>> akshay.bhardwaj1...@gmail.com> wrote:
>> >>
>> >> Hi All,
>> >>
>> >> I am running Spark 2.3 on YARN using HDP 2.6.
>> >>
>> >> I am running a Spark job using dynamic resource allocation on YARN,
>> with a minimum of 2 executors and a maximum of 6. My job reads data from
>> Parquet files on S3 buckets and stores some enriched data in Cassandra.
>> >>
>> >> My question is: how does YARN decide on which nodes to launch
>> containers? I have around 12 YARN nodes running in the cluster, but I
>> still see repeated patterns of 3-4 containers launched on the same node
>> for a particular job.
>> >>
>> >> What is the best way to start debugging this?
>> >>
>> >> Akshay Bhardwaj
>> >> +91-97111-33849
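For anyone landing on this thread later, a sketch of how the setting Hari mentions would look in capacity-scheduler.xml (check the CapacityScheduler docs for your Hadoop version, since the property is not present in older releases, and refresh the scheduler config, e.g. with `yarn rmadmin -refreshQueues`, for it to take effect):

```xml
<!-- capacity-scheduler.xml: spread containers across nodes by limiting
     the scheduler to one container assignment per node heartbeat.
     Default is true (multiple assignments allowed). -->
<property>
  <name>yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled</name>
  <value>false</value>
</property>
```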