Github user zjffdu commented on the issue:

    https://github.com/apache/zeppelin/pull/2899
  
    Thanks @jongyoul. Overall I have two concerns.
    
    1. Downloading Spark in `zeppelin-daemon.sh` doesn't make sense to me,
because some users may not use Spark in Zeppelin at all; they may only use JDBC
or other interpreters. And even users who do use Spark may not have internet
access, in which case Zeppelin would fail to start. Personally, I'd prefer to
defer the download, e.g. until the first time the Spark interpreter runs. If
the user doesn't specify `SPARK_HOME`, we can either download Spark in the
backend and display a proper message in the frontend, or just display an error
message in the frontend asking the user to download Spark manually and specify
`SPARK_HOME`.
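    The deferred-download idea could be sketched roughly like this (the class
and method names below are hypothetical, purely to illustrate the decision at
first run of the Spark interpreter):

```java
import java.util.Map;

// Hypothetical sketch of deferring the Spark download: nothing is fetched
// at Zeppelin startup; only when the Spark interpreter first runs do we
// check SPARK_HOME and decide what to do.
public class DeferredSparkSetup {

  /** What should happen when the Spark interpreter starts. */
  enum Action { USE_EXISTING, DOWNLOAD_OR_PROMPT }

  static Action onFirstSparkRun(Map<String, String> env) {
    String sparkHome = env.get("SPARK_HOME");
    if (sparkHome != null && !sparkHome.isEmpty()) {
      return Action.USE_EXISTING;
    }
    // No SPARK_HOME: either download Spark in the backend and show a
    // progress message in the frontend, or show an error asking the user
    // to install Spark manually and set SPARK_HOME.
    return Action.DOWNLOAD_OR_PROMPT;
  }
}
```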
    
    2. Running the unit tests for `PySparkInterpreter` & `SparkRInterpreter`
requires correct configuration of `PYTHONPATH` (pyspark.zip & py4j) and
`SparkRLibPath` (sparkr.zip). Currently we download Spark in pom.xml and set
them properly there. But in your PR you set them in the Travis script, which
would make running these unit tests locally difficult, especially for new
contributors who don't know the underlying mechanism. My suggestion is to
download Spark programmatically via
[SparkDownloadUtils](https://github.com/apache/zeppelin/blob/master/zeppelin-zengine/src/test/java/org/apache/zeppelin/interpreter/SparkDownloadUtils.java)
and set the Java property `spark.home` to pass to `PySparkInterpreter` &
`SparkRInterpreter`, treating `spark.home` & `SPARK_HOME` the same in
[PythonUtils](https://github.com/apache/zeppelin/blob/master/spark/interpreter/src/main/java/org/apache/zeppelin/spark/PythonUtils.java)
(because we cannot set environment variables programmatically). That way, we
can run the unit tests of `PySparkInterpreter` & `SparkRInterpreter` easily
without any extra configuration.
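    A minimal sketch of treating the Java property `spark.home` and the
environment variable `SPARK_HOME` the same way (the class name is hypothetical;
the actual change would live in `PythonUtils`):

```java
// Sketch: resolve the Spark home from the Java property "spark.home"
// first, falling back to the SPARK_HOME environment variable. Tests can
// call System.setProperty("spark.home", ...) because environment
// variables cannot be set programmatically from Java.
public class SparkHomeResolver {

  static String getSparkHome() {
    String fromProperty = System.getProperty("spark.home");
    if (fromProperty != null && !fromProperty.isEmpty()) {
      return fromProperty;
    }
    return System.getenv("SPARK_HOME");
  }
}
```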
    
    \cc @felixcheung @Leemoonsoo 
    