Folks,

One of the big limitations of the current Spark on K8S implementation is that
it isn't possible to use local dependencies (SPARK-23153 [1]), i.e. code, JARs,
data, etc. that live only on the submission client.  This leaves end users with
three options for how to actually run their Spark jobs under K8S:

1. Store local dependencies on some external distributed file system, e.g. HDFS
2. Build custom images containing their local dependencies
3. Mount local dependencies into volumes that are mounted by the K8S pods

In all cases the onus is on the end user to do the prep work.  Option 1 is
unfortunately rare in the environments where we're looking to deploy Spark, and
Option 2 tends to be a non-starter because many of our customers whitelist
approved images, i.e. custom images are not permitted.
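
For reference, Option 1 looks something like the following, with the
application JAR staged on HDFS ahead of time (the API server URL, HDFS path and
image name here are all hypothetical):

  spark-submit \
    --master k8s://https://kube-apiserver:6443 \
    --deploy-mode cluster \
    --name my-app \
    --class com.example.MyApp \
    --conf spark.kubernetes.container.image=approved/spark:latest \
    hdfs://namenode:8020/deps/my-app.jar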


Option 3 is more workable, but it still requires users to supply a raft of
extra config options even for simple cases, or to rely upon the pending pod
template feature for more complex cases.
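
To give a sense of the extra config involved, a hostPath-based version of
Option 3 might look roughly like the sketch below (the volume name "deps" and
all the paths are hypothetical; the option names follow the
spark.kubernetes.*.volumes.* pattern from the volume mounting support, and the
local:// scheme tells Spark the file is already present inside the container):

  spark-submit \
    --master k8s://https://kube-apiserver:6443 \
    --deploy-mode cluster \
    --conf spark.kubernetes.container.image=approved/spark:latest \
    --conf spark.kubernetes.driver.volumes.hostPath.deps.mount.path=/deps \
    --conf spark.kubernetes.driver.volumes.hostPath.deps.options.path=/mnt/shared/deps \
    --conf spark.kubernetes.executor.volumes.hostPath.deps.mount.path=/deps \
    --conf spark.kubernetes.executor.volumes.hostPath.deps.options.path=/mnt/shared/deps \
    local:///deps/my-app.jar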


Ideally this would all just be handled automatically for users, in the way that
all the other resource managers handle it.  The K8S backend even did this at
one point in the downstream fork, but after a long discussion [2] that approach
was dropped in favour of using Spark's standard mechanisms, i.e. spark-submit.
Unfortunately this was apparently never followed through on, as it doesn't work
with master as of today.  Moreover, I am unclear how it could work for Spark on
K8S in cluster mode, where the driver itself runs inside a pod: the
spark-submit mechanism copies files from the driver's filesystem to the
executors via a file server on the driver, and a driver inside a pod can't see
local files on the submission client.  I think this may work out of the box in
client mode, but I haven't dug into that enough to verify it yet.
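
If client mode does work, I'd expect the submission to look something like the
sketch below, since the driver would run on the submission client itself and
its file server could see the local filesystem (the host name and paths are
hypothetical, and spark.driver.host would need to be routable from the
executor pods):

  spark-submit \
    --master k8s://https://kube-apiserver:6443 \
    --deploy-mode client \
    --conf spark.kubernetes.container.image=approved/spark:latest \
    --conf spark.driver.host=submission-client.example.com \
    --jars /local/path/extra-dep.jar \
    /local/path/my-app.jar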


I would like to start work on addressing this problem, but to be honest I am
unclear where to begin.  Using the standard spark-submit mechanism seems to be
the way to go, but I'm not sure how to get around the driver pod issue.  I
would appreciate any pointers on how and where to start from folks who've
looked at this previously.


Cheers,

Rob

[1] https://issues.apache.org/jira/browse/SPARK-23153

[2] https://lists.apache.org/thread.html/82b4ae9a2eb5ddeb3f7240ebf154f06f19b830f8b3120038e5d687a1@%3Cdev.spark.apache.org%3E
