I have pushed a few commits to make spark.files work for pyspark. Please check https://github.com/apache/incubator-zeppelin/pull/270 and let me know if it helps.
Thanks,
moon

On Thu, Sep 3, 2015 at 3:25 PM Axel Dahl <[email protected]> wrote:

> Also filed a bug on apache-spark:
>
> https://issues.apache.org/jira/browse/SPARK-10436
>
> On Thu, Sep 3, 2015 at 8:59 AM, moon soo Lee <[email protected]> wrote:
>
>> Let me investigate more..
>>
>> On Thu, Sep 3, 2015 at 12:13 AM Axel Dahl <[email protected]> wrote:
>>
>>> Doesn't seem to have any effect. :/ Nothing in the logs indicates it
>>> was added, and I'm getting the same error when trying to load it.
>>>
>>> On Wed, Sep 2, 2015 at 11:16 PM, moon soo Lee <[email protected]> wrote:
>>>
>>>> Maybe then you can try this option
>>>>
>>>> export SPARK_SUBMIT_OPTIONS="--py-files [comma-separated list of .zip,
>>>> .egg or .py files]"
>>>>
>>>> in conf/zeppelin-env.sh.
>>>>
>>>> Thanks,
>>>> moon
>>>>
>>>> On Wed, Sep 2, 2015 at 10:53 PM Axel Dahl <[email protected]> wrote:
>>>>
>>>>> So it seems that when you call, say:
>>>>>
>>>>> spark-submit xyz.py
>>>>>
>>>>> it converts xyz.py into the option "spark.files xyz.py", and because
>>>>> "xyz.py" was entered on the command line, it overwrote the "spark.files"
>>>>> entry that's in spark-defaults.conf.
>>>>>
>>>>> Is there another way to add py-files via spark-defaults.conf, or
>>>>> another way to configure Zeppelin to always add a set of configured files
>>>>> to the spark-submit job?
>>>>>
>>>>> -Axel
>>>>>
>>>>> On Wed, Sep 2, 2015 at 9:42 PM, Axel Dahl <[email protected]> wrote:
>>>>>
>>>>>> Thanks moon,
>>>>>>
>>>>>> I set spark.files in SPARK_HOME/conf/spark-defaults.conf,
>>>>>>
>>>>>> and when I run the pyspark shell it finds and adds these files,
>>>>>> but when I execute spark-submit it doesn't add them.
>>>>>> spark-submit does read spark-defaults.conf (because it does find the
>>>>>> spark.master entry), but for some reason it ignores the spark.files
>>>>>> directive... very strange, since the pyspark shell loads them properly.
>>>>>>
>>>>>> On Tue, Sep 1, 2015 at 11:25 PM, moon soo Lee <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I think the changes come from
>>>>>>> https://github.com/apache/incubator-zeppelin/pull/244.
>>>>>>>
>>>>>>> https://github.com/apache/incubator-zeppelin/pull/270 is not yet
>>>>>>> merged, but I suggest trying it. It uses spark-submit if you have
>>>>>>> SPARK_HOME defined. You'll just need to define your spark.files in
>>>>>>> SPARK_HOME/conf/spark-defaults.conf, without adding them to
>>>>>>> ZEPPELIN_JAVA_OPTS.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> moon
>>>>>>>
>>>>>>> On Tue, Sep 1, 2015 at 10:52 PM Axel Dahl <[email protected]> wrote:
>>>>>>>
>>>>>>>> Downloaded and compiled the latest Zeppelin.
>>>>>>>>
>>>>>>>> In my conf/zeppelin-env.sh file I have the following line:
>>>>>>>>
>>>>>>>> export
>>>>>>>> ZEPPELIN_JAVA_OPTS="-Dspark.files=/home/hduser/lib/sparklib.zip,/home/hduser/lib/service.cfg,/home/hduser/lib/helper.py"
>>>>>>>>
>>>>>>>> This used to work, but when I inspect the folder using
>>>>>>>> SparkFiles.getRootDirectory(), it doesn't show any of the files in the
>>>>>>>> folder.
>>>>>>>>
>>>>>>>> I have checked that all the files are accessible at the specified
>>>>>>>> paths. There's nothing in the logs to indicate that "ZEPPELIN_JAVA_OPTS"
>>>>>>>> was read, but it looks like other entries are being read (e.g.
>>>>>>>> SPARK_HOME).
>>>>>>>>
>>>>>>>> Did this change from previous versions?
>>>>>>>>
>>>>>>>> -Axel
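Pulling the two workarounds suggested in the thread above into one place, conf/zeppelin-env.sh might look something like this (the paths are the example paths already used in this thread, not required values; adjust to your own files):

```shell
# conf/zeppelin-env.sh
# Option 1 (moon's suggestion): ship Python dependencies with every
# spark-submit that Zeppelin launches. Paths below are the thread's
# example paths; substitute your own .zip/.egg/.py files.
export SPARK_SUBMIT_OPTIONS="--py-files /home/hduser/lib/sparklib.zip,/home/hduser/lib/helper.py"

# Option 2 (with PR 270 merged and SPARK_HOME defined): declare the files
# in SPARK_HOME/conf/spark-defaults.conf instead of ZEPPELIN_JAVA_OPTS:
#
#   spark.files  /home/hduser/lib/sparklib.zip,/home/hduser/lib/service.cfg,/home/hduser/lib/helper.py
```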

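Axel's observation about spark-submit clobbering the spark-defaults.conf entry can be illustrated with a toy merge. This is only a sketch of the precedence behavior he describes (command-line-derived settings replace, rather than append to, same-keyed defaults), not Spark's actual implementation:

```python
# Toy illustration (NOT Spark's actual code) of the precedence Axel observed:
# settings derived from the spark-submit command line overwrite entries
# loaded earlier from spark-defaults.conf under the same key.

def effective_conf(defaults, cli_overrides):
    """Merge config dicts; command-line values win on key collisions."""
    conf = dict(defaults)
    conf.update(cli_overrides)  # later source replaces, does not append
    return conf

# spark-defaults.conf contributes a spark.files entry ...
defaults = {
    "spark.master": "local[*]",
    "spark.files": "/home/hduser/lib/helper.py",
}

# ... but `spark-submit xyz.py` turns the primary resource into its own
# spark.files setting, which clobbers the default entry entirely:
cli = {"spark.files": "xyz.py"}

merged = effective_conf(defaults, cli)
print(merged["spark.files"])  # → xyz.py  (the default helper.py is gone)
print(merged["spark.master"])  # → local[*]  (untouched keys survive)
```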