Ahyoung created ZEPPELIN-1332:
---------------------------------

             Summary: Removing spark-dependencies
                 Key: ZEPPELIN-1332
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-1332
             Project: Zeppelin
          Issue Type: Improvement
            Reporter: Ahyoung
            Assignee: Ahyoung
             Fix For: 0.7.0

*Why?*

The whole package for the latest version of Zeppelin is over 500MB, and it keeps growing as more interpreters are added. Compared with the Spark binary packages (spark-2.0.0-bin-hadoop2.7.tgz is 178MB and spark-2.0.0-bin-without-hadoop.tgz is 109MB), the Zeppelin package is quite large. Moreover, many Spark interpreter users run their own Spark installation rather than Zeppelin's embedded one, so they don't need spark-dependencies to be included. A first approach to this issue was suggested by [~jongyoul] in [PR#1115|https://github.com/apache/zeppelin/pull/1115].

*New suggestion*

I know Zeppelin's embedded Spark is very useful for Zeppelin beginners, because they don't need to download Spark or set SPARK_HOME themselves when they want to use the Spark interpreter in Zeppelin. So instead of just removing spark-dependencies/pom.xml, I would like to suggest downloading a Spark binary package (maybe spark-2.0.0-bin-hadoop2.7.tgz?) from a mirror site with a shell script. The script would check whether SPARK_HOME is set. If it isn't set yet, the script would download the Spark binary package when the user starts the Zeppelin daemon.
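To make the suggestion concrete, here is a rough sketch of what such a check could look like. It is only an illustration, not an existing Zeppelin script: the function names, the `LOCAL_SPARK_DIR` location, and the use of archive.apache.org as the download source are all assumptions, and a real version would need mirror selection and checksum verification.

```shell
#!/bin/sh
# Hypothetical sketch: every name below is an assumption, not part of Zeppelin.

SPARK_VERSION="${SPARK_VERSION:-2.0.0}"
HADOOP_PROFILE="${HADOOP_PROFILE:-hadoop2.7}"
SPARK_ARCHIVE="spark-${SPARK_VERSION}-bin-${HADOOP_PROFILE}"
# Assumed download source; the issue only says "mirror site".
SPARK_MIRROR="${SPARK_MIRROR:-https://archive.apache.org/dist/spark}"
# Assumed directory where a downloaded Spark would be unpacked.
LOCAL_SPARK_DIR="${LOCAL_SPARK_DIR:-${ZEPPELIN_HOME:-.}/local-spark}"

# Print the URL of the Spark binary package to fetch.
spark_download_url() {
  echo "${SPARK_MIRROR}/spark-${SPARK_VERSION}/${SPARK_ARCHIVE}.tgz"
}

# Keep a user-provided SPARK_HOME; otherwise download and unpack Spark once,
# then point SPARK_HOME at the unpacked directory.
ensure_spark_home() {
  if [ -n "${SPARK_HOME:-}" ]; then
    echo "Using existing SPARK_HOME=${SPARK_HOME}"
    return 0
  fi
  if [ ! -d "${LOCAL_SPARK_DIR}/${SPARK_ARCHIVE}" ]; then
    echo "SPARK_HOME is not set, downloading ${SPARK_ARCHIVE}.tgz"
    mkdir -p "${LOCAL_SPARK_DIR}" || return 1
    curl -fSL "$(spark_download_url)" \
      -o "${LOCAL_SPARK_DIR}/${SPARK_ARCHIVE}.tgz" || return 1
    tar -xzf "${LOCAL_SPARK_DIR}/${SPARK_ARCHIVE}.tgz" \
      -C "${LOCAL_SPARK_DIR}" || return 1
  fi
  SPARK_HOME="${LOCAL_SPARK_DIR}/${SPARK_ARCHIVE}"
  export SPARK_HOME
}
```

The daemon start script could call `ensure_spark_home` before launching the interpreter, so users who already export SPARK_HOME never trigger a download, and beginners only pay the download cost on the first start.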