Ahyoung created ZEPPELIN-1332:
---------------------------------

             Summary: Removing spark-dependencies
                 Key: ZEPPELIN-1332
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-1332
             Project: Zeppelin
          Issue Type: Improvement
            Reporter: Ahyoung
            Assignee: Ahyoung
             Fix For: 0.7.0


*Why?*
The full package for the latest version of Zeppelin is over 500MB, and it keeps 
growing as more interpreters are added. Compared to the Spark binary packages 
(spark-2.0.0-bin-hadoop2.7.tgz is 178MB and 
spark-2.0.0-bin-without-hadoop.tgz is 109MB), the Zeppelin package is quite 
large. Many Spark interpreter users also run their own Spark rather than 
Zeppelin's embedded one, so they don't need spark-dependencies to be included. 
The idea of removing them was first raised in 
[PR#1115|https://github.com/apache/zeppelin/pull/1115] by [~jongyoul].

*New suggestion*
I know Zeppelin's embedded Spark is very useful to Zeppelin beginners, because 
they don't need to download Spark or set SPARK_HOME themselves when they want 
to use the Spark interpreter in Zeppelin. So instead of just removing 
spark-dependencies/pom.xml, I would like to suggest downloading a Spark binary 
package (maybe spark-2.0.0-bin-hadoop2.7.tgz?) from a mirror site using a shell 
script. The script would check whether SPARK_HOME is set. If it isn't, the 
script would download the Spark binary package when the user starts the 
Zeppelin daemon.
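
A minimal sketch of what such a script could look like, to make the SPARK_HOME 
check concrete. The variable names (SPARK_VERSION, HADOOP_PROFILE, SPARK_MIRROR, 
LOCAL_SPARK_DIR), the function names, and the use of archive.apache.org as the 
download site are all assumptions for illustration, not existing Zeppelin 
settings or an agreed design.

```shell
#!/bin/sh
# Sketch only: every name below is hypothetical, not an existing Zeppelin setting.
SPARK_VERSION="${SPARK_VERSION:-2.0.0}"
HADOOP_PROFILE="${HADOOP_PROFILE:-hadoop2.7}"
# Assumed download site; a real script would probably pick a nearby mirror.
SPARK_MIRROR="${SPARK_MIRROR:-https://archive.apache.org/dist/spark}"
LOCAL_SPARK_DIR="${LOCAL_SPARK_DIR:-./local-spark}"

# Name of the unpacked package directory, e.g. spark-2.0.0-bin-hadoop2.7
spark_package_name() {
  echo "spark-${SPARK_VERSION}-bin-${HADOOP_PROFILE}"
}

# Full URL of the .tgz archive on the assumed download site
spark_download_url() {
  echo "${SPARK_MIRROR}/spark-${SPARK_VERSION}/$(spark_package_name).tgz"
}

# Download the archive into LOCAL_SPARK_DIR and unpack it there
fetch_spark() {
  archive="${LOCAL_SPARK_DIR}/$(spark_package_name).tgz"
  mkdir -p "${LOCAL_SPARK_DIR}" &&
    curl -fsSL -o "${archive}" "$(spark_download_url)" &&
    tar -xzf "${archive}" -C "${LOCAL_SPARK_DIR}" &&
    rm -f "${archive}"
}

# Print the Spark home to use: the user's SPARK_HOME if it is set,
# otherwise the local copy, which is downloaded first if it is missing.
resolve_spark_home() {
  if [ -n "${SPARK_HOME:-}" ]; then
    echo "${SPARK_HOME}"
    return 0
  fi
  target="${LOCAL_SPARK_DIR}/$(spark_package_name)"
  if [ ! -d "${target}" ]; then
    fetch_spark >&2 || return 1
  fi
  echo "${target}"
}
```

The daemon start script could then source this file and run something like 
`SPARK_HOME="$(resolve_spark_home)" && export SPARK_HOME` before launching the 
Spark interpreter, so users who already set SPARK_HOME never trigger a download.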




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
