Github user AhyoungRyu commented on the issue:
https://github.com/apache/zeppelin/pull/1339
### To whom it may concern about breaking the current UX with this change
This change has many benefits compared to the current embedded Spark, as I
wrote in the PR description (and @tae-jun mentioned it in [this
comment](https://github.com/apache/zeppelin/pull/1339#issuecomment-259486249)
as well. Thanks!).
But as always, this kind of big change brings downsides as well (e.g.
breaking the current UX), so I want to write down how we can address some major
cases below. I think it would be better to share my opinion and get more feedback
before merging. :)
1. New Spark/Zeppelin user, running Zeppelin for the first time
: Quite easy to cover, and I guess it is already handled by updating the
related docs pages.
2. Existing Spark/Zeppelin user, running a new Zeppelin installation (e.g.
upgrading the version)
: This case is definitely harder to handle than 1, since the user already
expects that local mode will **just work** and surely won't read the docs.
To resolve this, I'll update `bin/download-spark.sh` so that when the user runs
`./bin/zeppelin-daemon.sh start` it prints something like "You don't have
local-spark/, you can download embedded Spark with the `get-spark` option."
(see the first sketch after this list). This message can be removed in the
future, once Zeppelin users have gotten accustomed to the `get-spark` option.
3. Docker user, starting `bin/zeppelin.sh` inside the container
: This one can also be hard to handle because the user might assume that
Spark just works. So I would suggest applying this change to #1538 as a first
step, since that can be a Zeppelin-provided official docker script.
4. CI issue
: Since @bzz raised some concerns about the CI issue, let me answer again here
to make sure. :)
The reason I removed `-Ppyspark` in `.travis` is that the `pyspark` profile
only exists in `spark-dependencies/pom.xml`, so the `pyspark` profile won't
exist anymore after this PR is merged. The Pyspark test case that @astroshim
added recently did have some conflict with this change, but we solved it by
simply adding ``export SPARK_HOME=`pwd`/spark-$SPARK_VER-bin-hadoop$HADOOP_VER``
to `.travis.yml` so that Travis sets it before running the test script (see the
second sketch after this list). So there are no more CI issues, especially
concerning the removal of the `spark-dependencies` related build profiles.
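
For case 2, here is a minimal sketch of the kind of check I have in mind for the
startup path; the `ZEPPELIN_HOME` variable, the `local-spark/` location, and the
exact wording are assumptions for illustration, not the final implementation:

```sh
# Hypothetical startup check (variable names and local-spark/ location are assumptions).
# If the embedded Spark has not been downloaded yet, print a hint instead of failing silently.
if [[ ! -d "${ZEPPELIN_HOME}/local-spark" ]]; then
  echo "You don't have local-spark/, you can download embedded Spark with the get-spark option."
fi
```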
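
And for case 4, the addition to `.travis.yml` boils down to this single line
(shown here as the shell command Travis runs; `$SPARK_VER` and `$HADOOP_VER` are
already defined elsewhere in the CI config):

```sh
# Point the Pyspark test case at the Spark distribution Travis already downloaded.
export SPARK_HOME=`pwd`/spark-$SPARK_VER-bin-hadoop$HADOOP_VER
```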