Hi all, if I want to ship the spark-submit script to HDFS and then call it from its HDFS location to start a Spark job, which other files/folders/jars need to be transferred into HDFS along with the spark-submit script?
Due to some dependency issues, we cannot include Spark in our Java application, so instead we will allow limited usage of Spark, restricted to Python files. So if I want to put the spark-submit script into HDFS and call it to execute a Spark job on a YARN cluster, what else needs to be put into HDFS with it? (We are using Spark only to execute Spark jobs written in Python.) Thanks.
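For concreteness, this is roughly the flow I have in mind (all paths here are hypothetical, just for illustration): pull the script down from a staging directory on HDFS, then invoke it to run a Python job on YARN. My question is what else /apps/spark on HDFS would have to contain for the last step to actually work, since spark-submit presumably depends on other parts of the Spark distribution (jars, conf, launcher scripts).

    # hypothetical HDFS staging dir; unclear what else must live alongside the script
    hdfs dfs -get hdfs:///apps/spark/bin/spark-submit /tmp/spark/bin/spark-submit
    chmod +x /tmp/spark/bin/spark-submit

    # run a Python Spark job on the YARN cluster
    /tmp/spark/bin/spark-submit \
        --master yarn \
        --deploy-mode cluster \
        hdfs:///apps/spark/jobs/my_job.py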