Hi,

--master yarn-client is deprecated; use --master yarn --deploy-mode client
instead. There are two deploy modes: client (the default) and cluster. See
http://spark.apache.org/docs/latest/cluster-overview.html.
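For example, the submit command quoted below, rewritten with the
non-deprecated flags (same resource settings; the application jar and its
arguments are elided in the original, so a placeholder stands in here),
would look something like:

${SPARK_HOME}/bin/spark-submit \
  --master yarn \
  --deploy-mode client \
  --driver-memory=4G \
  --num-executors=5 \
  --executor-memory=4G \
  --executor-cores=4 \
  /path/to/your-app.jar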
Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

On Tue, Jun 7, 2016 at 2:50 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> ok thanks
>
> so I start SparkSubmit or a similar Spark app on the Yarn resource manager
> node.
>
> What you are stating is that Yarn may decide to start the driver program on
> another node, as opposed to the resource manager node
>
> ${SPARK_HOME}/bin/spark-submit \
>     --driver-memory=4G \
>     --num-executors=5 \
>     --executor-memory=4G \
>     --master yarn-client \
>     --executor-cores=4 \
>
> due to a lack of resources on the resource manager node? What is the
> likelihood of that? The resource manager node is the de facto master node,
> in all probability much more powerful than the other nodes. Also, the node
> running the resource manager is running one of the node managers as well.
> So perhaps in theory, but maybe not in practice?
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> On 7 June 2016 at 13:20, Sebastian Piu <sebastian....@gmail.com> wrote:
>>
>> What you are explaining is right for yarn-client mode, but the question is
>> about yarn-cluster, in which case the Spark driver is also submitted and
>> run on one of the node managers.
>>
>> On Tue, 7 Jun 2016, 13:45 Mich Talebzadeh, <mich.talebza...@gmail.com>
>> wrote:
>>>
>>> Can you elaborate on the above statement, please?
>>>
>>> When you start Yarn, you start the resource manager daemon only on the
>>> resource manager node:
>>>
>>> yarn-daemon.sh start resourcemanager
>>>
>>> Then you start the node manager daemons on all nodes:
>>>
>>> yarn-daemon.sh start nodemanager
>>>
>>> A Spark app has to start somewhere. That is SparkSubmit, and that is
>>> deterministic. I start SparkSubmit, which talks to the Yarn resource
>>> manager, which initialises and registers an application master. The
>>> crucial point is that the Yarn resource manager is basically a resource
>>> scheduler. It optimizes for cluster resource utilization to keep all
>>> resources in use all the time. However, the resource manager itself is
>>> on the resource manager node.
>>>
>>> Now I always start my Spark app on the same node as the resource manager
>>> node and let Yarn take care of the rest.
>>>
>>> Thanks
>>>
>>> Dr Mich Talebzadeh
>>>
>>> LinkedIn
>>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>
>>> http://talebzadehmich.wordpress.com
>>>
>>> On 7 June 2016 at 12:17, Jacek Laskowski <ja...@japila.pl> wrote:
>>>>
>>>> Hi,
>>>>
>>>> It's not possible. YARN uses CPU and memory for resource constraints
>>>> and places the AM on any node available. The same applies to executors
>>>> (unless data locality constrains the placement).
>>>>
>>>> Jacek
>>>>
>>>> On 6 Jun 2016 1:54 a.m., "Saiph Kappa" <saiph.ka...@gmail.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> In yarn-cluster mode, is there any way to specify on which node I want
>>>>> the driver to run?
>>>>>
>>>>> Thanks.
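P.S. If the goal is for the driver to run inside the cluster (on whichever
node manager YARN picks; as noted above, you cannot pin it to a specific
node), the cluster-mode equivalent would be along these lines (again with a
placeholder for the application jar):

${SPARK_HOME}/bin/spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory=4G \
  --num-executors=5 \
  --executor-memory=4G \
  --executor-cores=4 \
  /path/to/your-app.jar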