Folks,
One of the big limitations of the current Spark on K8S implementation is that it isn't possible to use local dependencies (SPARK-23153 [1]), i.e. code, JARs, data, etc. that live only on the submission client. This leaves end users with a few options for actually running their Spark jobs under K8S:

1. Store local dependencies on some external distributed file system, e.g. HDFS
2. Build custom images containing their local dependencies
3. Mount local dependencies into volumes that are mounted by the K8S pods

In all cases the onus is on the end user to do the prep work. Option 1 is unfortunately rare in the environments where we're looking to deploy Spark, and Option 2 tends to be a non-starter because many of our customers whitelist approved images, i.e. custom images are not permitted. Option 3 is more workable, but it still requires users to provide a bunch of extra config options even for simple cases, or to rely upon the pending pod template feature for complex cases.

Ideally this would all just be handled automatically for users, the way every other resource manager does it. The K8S backend even did this at one point in the downstream fork, but after a long discussion [2] that approach was dropped in favour of using Spark's standard mechanism, i.e. spark-submit. Unfortunately this was apparently never followed through on, as it doesn't work with master as of today.

Moreover, I am unclear how this would work in Spark on K8S cluster mode, where the driver itself runs inside a pod. The spark-submit mechanism copies files from the driver's filesystem to the executors via a file server on the driver; if the driver is inside a pod, it won't be able to see local files on the submission client. I think this may work out of the box with client mode, but I haven't dug into that enough to verify yet.

I would like to start work on addressing this problem, but to be honest I am unclear where to start.
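For reference, here is a rough sketch of the kind of extra configuration Option 3 requires today, using the spark.kubernetes volume properties. The volume name "deps", the paths, the API server address, and the job class are all illustrative placeholders, not a real deployment:

```shell
# Hedged sketch of Option 3: mounting a hostPath volume holding local
# dependencies into both the driver and executor pods. All names/paths
# below are hypothetical examples.
spark-submit \
  --master k8s://https://kube-apiserver.example.com:6443 \
  --deploy-mode cluster \
  --conf spark.kubernetes.driver.volumes.hostPath.deps.mount.path=/opt/spark-deps \
  --conf spark.kubernetes.driver.volumes.hostPath.deps.options.path=/opt/spark-deps \
  --conf spark.kubernetes.executor.volumes.hostPath.deps.mount.path=/opt/spark-deps \
  --conf spark.kubernetes.executor.volumes.hostPath.deps.options.path=/opt/spark-deps \
  --class com.example.MyJob \
  local:///opt/spark-deps/my-job.jar
```

Note that even this "simple" case needs four volume properties plus the local:// URI scheme to tell Spark the JAR is already inside the image/pod filesystem, which is the kind of per-job boilerplate that other resource managers spare users from.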
It seems using the standard spark-submit mechanism is the way to go, but I'm not sure how to get around the driver pod issue. I would appreciate any pointers from folks who've looked at this previously on how and where to start.

Cheers,

Rob

[1] https://issues.apache.org/jira/browse/SPARK-23153
[2] https://lists.apache.org/thread.html/82b4ae9a2eb5ddeb3f7240ebf154f06f19b830f8b3120038e5d687a1@%3Cdev.spark.apache.org%3E