Reamer edited a comment on pull request #4097: URL: https://github.com/apache/zeppelin/pull/4097#issuecomment-825479159
> Although the image size may look a little big, the physical size or downloading size would be the same of using the current python interpreter image and then download the python environment via whatever approach.

You are right, the final disk size would be the same. It might even be lower, because if the same Python interpreter is started twice, both containers share the same image layer. But in my eyes the image approach has two big disadvantages:

1) If we put the Python environment (conda environment) into the image, the user must select that image. I currently know of no way to check whether the image is present, which makes error handling in K8s very difficult. I think error handling inside the Zeppelin interpreter is much easier.

2) For PySpark you need to build two images: a Zeppelin interpreter image (with the Python environment, Spark, and the Zeppelin interpreter) and a Spark executor image (with the Python environment and Spark).

Do you know if it is possible to set a URL for `spark.archives`? If not, there is no alternative to putting the specific Python environment into the image.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
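For context, a sketch of the archive-based alternative being weighed against the image approach: Spark 3.1+ can ship a packed conda environment to driver and executors via `spark.archives`, so neither image needs the Python environment baked in. The environment name, file names, and Python packages below are illustrative, not part of this PR; whether a remote URL works for `spark.archives` is exactly the open question above, so the sketch uses a local file.

```shell
# Illustrative only: pack a conda environment with conda-pack ...
conda create -y -n pyspark_env python=3.8 numpy
conda pack -n pyspark_env -o pyspark_env.tar.gz

# ... and ship it via spark.archives. The archive is unpacked on the
# driver and on each executor under the alias given after '#', so the
# packed interpreter can be referenced with a relative path.
export PYSPARK_PYTHON=./environment/bin/python
spark-submit \
  --conf spark.archives=pyspark_env.tar.gz#environment \
  app.py
```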