What you are explaining is right for yarn-client mode, but the question is about yarn-cluster mode, in which case the Spark driver itself is submitted to YARN and runs inside the Application Master on one of the node managers.
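To make the distinction concrete, here is a minimal spark-submit invocation for each deploy mode (the class name and jar below are hypothetical placeholders):

# yarn-client mode: the driver runs in the spark-submit JVM on the
# machine you launch from
spark-submit --master yarn --deploy-mode client --class com.example.MyApp myapp.jar

# yarn-cluster mode: the driver runs inside the Application Master on a
# node manager chosen by YARN's scheduler
spark-submit --master yarn --deploy-mode cluster --class com.example.MyApp myapp.jar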
On Tue, 7 Jun 2016, 13:45 Mich Talebzadeh, <mich.talebza...@gmail.com> wrote:

> Can you elaborate on the above statement please?
>
> When you start YARN, you start the resource manager daemon only on the
> resource manager node:
>
> yarn-daemon.sh start resourcemanager
>
> Then you start the nodemanager daemons on all nodes:
>
> yarn-daemon.sh start nodemanager
>
> A Spark app has to start somewhere; that is SparkSubmit, and that is
> deterministic. I start SparkSubmit, which talks to the YARN Resource
> Manager, which initialises and registers an Application Master. The
> crucial point is the YARN Resource Manager, which is basically a resource
> scheduler. It optimizes for cluster resource utilization to keep all
> resources in use all the time. However, the resource manager itself is on
> the resource manager node.
>
> Now I always start my Spark app on the same node as the resource manager
> node and let YARN take care of the rest.
>
> Thanks
>
> Dr Mich Talebzadeh
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
>
> On 7 June 2016 at 12:17, Jacek Laskowski <ja...@japila.pl> wrote:
>
>> Hi,
>>
>> It's not possible. YARN uses CPU and memory for resource constraints and
>> places the AM on any available node. The same goes for executors (unless
>> data locality constrains the placement).
>>
>> Jacek
>>
>> On 6 Jun 2016 1:54 a.m., "Saiph Kappa" <saiph.ka...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> In yarn-cluster mode, is there any way to specify on which node I want
>>> the driver to run?
>>>
>>> Thanks.
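While YARN does not let you pin the driver to a particular node in yarn-cluster mode, you can at least find out after the fact which node manager it landed on. A minimal sketch using the standard YARN CLI (the application ID below is hypothetical):

# list running applications to find the application ID
yarn application -list -appStates RUNNING

# the "AM Host" field of the report is the node manager hosting the
# Application Master, and hence the driver in yarn-cluster mode
yarn application -status application_1465300000000_0001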