Reamer edited a comment on pull request #4097:
URL: https://github.com/apache/zeppelin/pull/4097#issuecomment-825479159


   > Although the image size may look a little big, the physical size or 
downloading size would be the same of using the current python interpreter 
image and then download the python environment via whatever approach.
   
   You are right, the final disk size would be the same. It could even be smaller, because if the same Python interpreter image is started twice, both containers share the same image layers.
   But in my eyes the image approach has significant disadvantages.
   1) If we put the Python environment (conda environment) into the image, then the user must select the image. I currently don't know of any way to check whether the image is present, which makes error handling in K8s very difficult. I think error handling inside the Zeppelin interpreter is much easier.
   2) For PySpark you need to build two images: a Zeppelin interpreter image (with the Python environment, Spark, and the Zeppelin interpreter) and a Spark executor image (with the Python environment and Spark).
   
   Do you know if it is possible to set a URL for `spark.archives`? If not, then there is no way around putting the specific Python environment into the image.
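   If `spark.archives` does accept remote URLs (spark-submit generally resolves schemes like `hdfs://` and `https://` for dependencies), the environment could be shipped at submit time instead of being baked into the image. A hedged sketch, assuming Spark 3.1+ and `conda-pack`; the environment name and paths below are made up:

   ```shell
   # Pack a conda environment into a relocatable archive and upload it.
   # (Environment name "my_pyspark_env" and the HDFS path are hypothetical.)
   conda pack -n my_pyspark_env -o my_pyspark_env.tar.gz
   hdfs dfs -put my_pyspark_env.tar.gz /envs/

   # Point spark.archives at the archive; the "#environment" fragment is the
   # directory name it is unpacked into on each executor.
   spark-submit \
     --conf spark.archives=hdfs:///envs/my_pyspark_env.tar.gz#environment \
     --conf spark.pyspark.python=./environment/bin/python \
     app.py
   ```

   Whether this works here depends on which URL schemes the dependency handling actually supports in the K8s deployment.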


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

