Hi Nikhil,

The application jar is added to spark.jars by default
<https://github.com/apache/spark/blob/777b4502b206b7240c6655d3c0b0a9ce08f6a09c/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L623>,
so it is fetched by the executors when tasks are launched. Behind the scenes,
SparkContext adds these files to the driver's file server and the
TaskSetManager attaches them to the tasks, so when a task is deserialized on
the executor side it carries the file URIs, but with a different scheme,
spark://, which is the file server's scheme. The executors then download them
in the updateDependencies call.
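If you want to confirm that part from the driver side, a quick sanity check is
to print what the driver's file server is actually serving. This is only a
rough sketch in Java (the bare context construction here is hypothetical; in
your setup the app name, master and the rest come from the launcher /
spark-submit config):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    // Rough sketch: jars listed here have been registered with the driver's
    // file server, so executors fetch them over spark:// at task launch.
    // An empty list means the application jar never made it into spark.jars.
    JavaSparkContext jsc = new JavaSparkContext(new SparkConf());
    System.out.println("registered jars:\n" + jsc.sc().listJars().mkString("\n"));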
On the executor side, when it works you should see something like this:

19/04/19 23:15:05 INFO Executor: Fetching spark://spark-pi-2775276a37e147f8-driver-svc.spark.svc:7078/jars/spark-examples_2.12-3.0.0-SNAPSHOT.jar with timestamp 1555715697141
19/04/19 23:15:05 INFO TransportClientFactory: Successfully created connection to spark-pi-2775276a37e147f8-driver-svc.spark.svc/172.17.0.4:7078 after 4 ms (0 ms spent in bootstraps)
19/04/19 23:15:05 INFO Utils: Fetching spark://spark-pi-2775276a37e147f8-driver-svc.spark.svc:7078/jars/spark-examples_2.12-3.0.0-SNAPSHOT.jar to /var/data/spark-7bb5652a-7289-43b2-8e0a-2b4687eddb51/spark-4fccc535-47e8-49c1-818f-a44eb268f09e/fetchFileTemp951703538232079938.tmp
19/04/19 23:15:05 INFO Utils: Copying /var/data/spark-7bb5652a-7289-43b2-8e0a-2b4687eddb51/spark-4fccc535-47e8-49c1-818f-a44eb268f09e/11085522591555715697141_cache to /opt/spark/work-dir/./spark-examples_2.12-3.0.0-SNAPSHOT.jar

Could you add --verbose so we can see what arguments spark-submit gets? What
is the value of LINUX_APP_RESOURCE? One trick would be to set spark.jars
explicitly (or add the jar manually with sparkContext.addJar), but I would
still like to know what is happening, e.g. whether args.primaryResource is
set. The reason you see that output is that, for some reason, the file URI is
passed as-is to the executors, which then try to fetch the file from their
local filesystem, where it does not exist.
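Concretely, both suggestions together would look roughly like this on the
launcher side (just a sketch reusing the constant names from your snippet;
keep your other .setConf(...) calls as they are):

    import org.apache.spark.launcher.SparkLauncher;

    sparkLaunch = new SparkLauncher()
            .setMaster(LINUX_MASTER)
            .setAppResource(LINUX_APP_RESOURCE)
            .setMainClass(MAIN_CLASS)
            .setDeployMode("client")
            // make spark-submit print its parsed arguments and properties,
            // including the primary resource:
            .setVerbose(true)
            // explicitly register the application jar so executors fetch it
            // from the driver's file server instead of a local path:
            .setConf("spark.jars", LINUX_APP_RESOURCE);

With --verbose in place, the spark-submit output should show whether the
primary resource ends up in spark.jars at all.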
Best,
Stavros

On Tue, Apr 16, 2019 at 11:29 AM Nikhil Chinnapa <nishant.ran...@renovite.com> wrote:

> Environment:
> Spark: 2.4.0
> Kubernetes: 1.14
>
> Query: Does the application jar need to be part of both the Driver and
> Executor image?
>
> Invocation point (from Java code):
>
> sparkLaunch = new SparkLauncher()
>         .setMaster(LINUX_MASTER)
>         .setAppResource(LINUX_APP_RESOURCE)
>         .setConf("spark.app.name", APP_NAME)
>         .setMainClass(MAIN_CLASS)
>         .setConf("spark.executor.instances", EXECUTOR_COUNT)
>         .setConf("spark.kubernetes.container.image", CONTAINER_IMAGE)
>         .setConf("spark.kubernetes.driver.pod.name", DRIVER_POD_NAME)
>         .setConf("spark.kubernetes.container.image.pullSecrets", REGISTRY_SECRET)
>         .setConf("spark.kubernetes.authenticate.driver.serviceAccountName", SERVICE_ACCOUNT_NAME)
>         .setConf("spark.driver.host", SERVICE_NAME + "." + NAMESPACE + ".svc.cluster.local")
>         .setConf("spark.driver.port", DRIVER_PORT)
>         .setDeployMode("client");
>
> Scenario:
> I am trying to run Spark on K8s in client mode. When I put the application
> jar in both the driver and executor images then the program works fine.
>
> But, if I put the application jar in the driver image only, then I get the
> following error:
>
> 2019-04-16 06:36:44 INFO Executor:54 - Fetching file:/opt/spark/examples/jars/reno-spark-codebase-0.1.0.jar with timestamp 1555396592768
> 2019-04-16 06:36:44 INFO Utils:54 - Copying /opt/spark/examples/jars/reno-spark-codebase-0.1.0.jar to /var/data/spark-d24c8fbc-4fe7-4968-9310-f891a097d1e7/spark-31ba5cbb-3132-408c-991a-795
> 2019-04-16 06:36:44 ERROR Executor:91 - Exception in task 0.1 in stage 0.0 (TID 2)
> java.nio.file.NoSuchFileException: /opt/spark/examples/jars/reno-spark-codebase-0.1.0.jar
>         at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
>         at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
>         at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
>         at java.base/sun.nio.fs.UnixCopyFile.copy(UnixCopyFile.java:548)
>         at java.base/sun.nio.fs.UnixFileSystemProvider.copy(UnixFileSystemProvider.java:254)
>         at java.base/java.nio.file.Files.copy(Files.java:1294)
>         at org.apache.spark.util.Utils$.org$apache$spark$util$Utils$$copyRecursive(Utils.scala:664)
>         at org.apache.spark.util.Utils$.copyFile(Utils.scala:635)
>         at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:719)
>         at org.apache.spark.util.Utils$.fetchFile(Utils.scala:496)
>         at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:805)
>         at org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:797)
>         at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
>         at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
>         at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
>         at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
>         at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
>         at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
>         at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
>         at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:797)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:369)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base/java.lang.Thread.run(Thread.java:834)
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org