Hello,

I am submitting a Spark application on YARN using the cluster deploy
mode.
The application depends on a couple of jars. I can successfully
submit and run it using the spark-submit --jars option, as seen
below:

spark-submit \
--name Yarn-App \
--class <FQN.Class> \
--properties-file conf/yarn.properties \
--jars lib/<first.jar>,lib/<second.jar>,lib/<third.jar> \
<application.jar> > log/yarn-app.txt 2>&1


With the yarn.properties being something like:

# Spark-submit config, used in conjunction with YARN cluster deploy mode,
# so that the spark-submit command does not block waiting for application
# completion.
spark.yarn.submit.waitAppCompletion=false
spark.submit.deployMode=cluster
spark.master=yarn

## General Spark Application properties
spark.driver.cores=2
spark.driver.memory=4G
spark.executor.memory=5G
spark.executor.cores=2
spark.driver.extraJavaOptions=-Xms2G
spark.driver.extraClassPath=<first.jar>:<second.jar>:<third.jar>
spark.executor.heartbeatInterval=30s

spark.shuffle.service.enabled=true
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.minExecutors=1
spark.dynamicAllocation.maxExecutors=100
spark.dynamicAllocation.initialExecutors=10
spark.kryo.referenceTracking=false
spark.kryoserializer.buffer.max=1G

spark.ui.showConsoleProgress=true
spark.yarn.am.cores=4
spark.yarn.am.memory=10G
spark.yarn.archive=<HDFS path to spark-only jars>
spark.yarn.historyServer.address=<url to history server>


However, I would like to have everything specified in the properties file,
to simplify my team's workflow and avoid forcing them to specify the jars
every time.
So my question is: which Spark property replaces the spark-submit *--jars*
parameter, so that I can specify everything in the properties file?

I've tried creating a tar.gz with the contents of the archive specified in
*spark.yarn.archive* plus the three extra jars I need, uploading that to
HDFS, and changing the archive property, but it did not work:
I got class-not-defined exceptions on classes that come from the three
extra jars.
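For reference, the attempt above can be sketched roughly as follows. This is a minimal, hypothetical reconstruction: the archive name, HDFS path, and jar names are placeholders, and the example uses empty stand-in files so the tar steps can be shown end to end.

```shell
set -e
# Stand-ins for the real inputs (assumptions, not the actual files):
mkdir -p work/lib && cd work
tar -czf spark-libs.tar.gz -T /dev/null            # empty stand-in for the spark.yarn.archive contents
touch lib/first.jar lib/second.jar lib/third.jar   # stand-ins for the three extra jars

# The attempted rebuild: unpack the archive, add the extra jars, re-pack.
mkdir -p archive && tar -xzf spark-libs.tar.gz -C archive
cp lib/*.jar archive/
tar -czf spark-libs-plus-deps.tar.gz -C archive .

# On the real cluster, one would then upload the new archive and repoint
# the property in yarn.properties (placeholder path):
#   hdfs dfs -put spark-libs-plus-deps.tar.gz /shared/spark/
#   spark.yarn.archive=hdfs:///shared/spark/spark-libs-plus-deps.tar.gz
```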

If it helps, the jars are only required by the driver, not the executors;
the executors will simply perform Spark-only operations.

Thank you, and have a good weekend.

--

*Pedro Cardoso*

*Research Engineer*

pedro.card...@feedzai.com



*The content of this email is confidential and intended for the recipient
specified in message only. It is strictly prohibited to share any part of
this message with any third party, without a written consent of the sender.
If you received this message by mistake, please reply to this message and
follow with its deletion, so that we can ensure such a mistake does not
occur in the future.*
