I have pushed a few commits to make spark.files work for pyspark.
Please check https://github.com/apache/incubator-zeppelin/pull/270 and let
me know if it helps.

Thanks,
moon

On Thu, Sep 3, 2015 at 3:25 PM Axel Dahl <[email protected]> wrote:

> Also filed a bug on apache-spark:
>
> https://issues.apache.org/jira/browse/SPARK-10436
>
>
>
> On Thu, Sep 3, 2015 at 8:59 AM, moon soo Lee <[email protected]> wrote:
>
>> Let me investigate more..
>>
>> On Thu, Sep 3, 2015 at 12:13 AM Axel Dahl <[email protected]> wrote:
>>
>>> Doesn't seem to have any effect. :/  There's nothing in the logs to
>>> indicate it was added, and I'm getting the same error when trying to
>>> load it.
>>>
>>> On Wed, Sep 2, 2015 at 11:16 PM, moon soo Lee <[email protected]> wrote:
>>>
>>>> Maybe then you can try this option
>>>>
>>>> export SPARK_SUBMIT_OPTIONS="--py-files [comma separated list of .zip,
>>>> .egg or .py]"
>>>>
>>>> in conf/zeppelin-env.sh
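For example, using the file paths mentioned later in this thread (a sketch only; adjust the paths for your environment, and note that a plain config file like service.cfg would need spark-submit's --files flag, since --py-files accepts only .zip, .egg, and .py):

```shell
# conf/zeppelin-env.sh -- example paths from this thread; adjust as needed.
# --py-files distributes Python dependencies to executors;
# --files handles non-Python files such as service.cfg.
export SPARK_SUBMIT_OPTIONS="--py-files /home/hduser/lib/sparklib.zip,/home/hduser/lib/helper.py --files /home/hduser/lib/service.cfg"
```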
>>>>
>>>> Thanks,
>>>> moon
>>>>
>>>> On Wed, Sep 2, 2015 at 10:53 PM Axel Dahl <[email protected]>
>>>> wrote:
>>>>
>>>>> So it seems that, when you call, say:
>>>>>
>>>>> spark-submit xyz.py
>>>>>
>>>>> it converts xyz.py into the option "spark.files   xyz.py", and because
>>>>> "xyz.py" was entered on the command line, it overwrites the "spark.files"
>>>>> entry in "spark-defaults.conf".
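A rough sketch of the precedence being described, modeling only the observed behavior (this is an illustration, not Spark's actual code): the primary script becomes the spark.files value and replaces, rather than appends to, the spark-defaults.conf entry.

```python
# Model of the observed behavior: a setting derived from the spark-submit
# command line replaces the spark-defaults.conf entry instead of merging.
def effective_conf(defaults, primary_resource):
    conf = dict(defaults)                   # start from spark-defaults.conf
    conf["spark.files"] = primary_resource  # CLI-derived value overwrites
    return conf

defaults = {
    "spark.master": "local[*]",                   # hypothetical default
    "spark.files": "/home/hduser/lib/helper.py",  # entry that gets lost
}
conf = effective_conf(defaults, "xyz.py")
print(conf["spark.files"])  # prints "xyz.py"; helper.py is gone
```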
>>>>>
>>>>> Is there another way to add py-files via spark-defaults.conf, or a way
>>>>> to configure Zeppelin to always add a set of configured files to the
>>>>> spark-submit job?
>>>>>
>>>>> -Axel
>>>>>
>>>>>
>>>>> On Wed, Sep 2, 2015 at 9:42 PM, Axel Dahl <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Thanks moon,
>>>>>>
>>>>>> I set spark.files in SPARK_HOME/conf/spark-defaults.conf
>>>>>>
>>>>>> and when I run the spark/bin/pyspark shell it finds and adds these
>>>>>> files, but when I execute spark/bin/spark-submit it doesn't add them.
>>>>>> spark-submit does read spark-defaults.conf (because it does find the
>>>>>> spark.master entry), but for some reason it ignores the spark.files
>>>>>> directive... very strange, since the pyspark shell loads them properly.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Sep 1, 2015 at 11:25 PM, moon soo Lee <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I think the changes come from
>>>>>>> https://github.com/apache/incubator-zeppelin/pull/244.
>>>>>>>
>>>>>>> https://github.com/apache/incubator-zeppelin/pull/270 is not yet
>>>>>>> merged, but I suggest trying it. It uses spark-submit if you have
>>>>>>> SPARK_HOME defined. You'll just need to define your spark.files in
>>>>>>> SPARK_HOME/conf/spark-defaults.conf, without adding them to
>>>>>>> ZEPPELIN_JAVA_OPTS.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> moon
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Sep 1, 2015 at 10:52 PM Axel Dahl <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I downloaded and compiled the latest Zeppelin.
>>>>>>>>
>>>>>>>> In my conf/zeppelin-env.sh file I have the following line:
>>>>>>>>
>>>>>>>> export
>>>>>>>> ZEPPELIN_JAVA_OPTS="-Dspark.files=/home/hduser/lib/sparklib.zip,/home/hduser/lib/service.cfg,/home/hduser/lib/helper.py"
>>>>>>>>
>>>>>>>> This used to work, but when I inspect the folder returned by
>>>>>>>> SparkFiles.getRootDirectory(), it doesn't show any of the files.
>>>>>>>>
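One way to check what actually reached the job's working directory is to list the SparkFiles root from inside a pyspark note. The helper below is plain Python (the expected file names are the ones from this thread); only the commented lines assume pyspark is available:

```python
import os

def missing_files(root_dir, expected_names):
    """Return the expected file names that are not present under root_dir."""
    present = set(os.listdir(root_dir))
    return [name for name in expected_names if name not in present]

# In a Zeppelin pyspark paragraph, point it at the SparkFiles root:
#   from pyspark import SparkFiles
#   missing_files(SparkFiles.getRootDirectory(),
#                 ["sparklib.zip", "service.cfg", "helper.py"])
```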
>>>>>>>> I have checked that all the files are accessible at the specified
>>>>>>>> paths. There's nothing in the logs to indicate that
>>>>>>>> "ZEPPELIN_JAVA_OPTS" was read, but it looks like other entries are
>>>>>>>> being read (e.g. SPARK_HOME).
>>>>>>>>
>>>>>>>> Did this change from previous versions?
>>>>>>>>
>>>>>>>> -Axel
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>
>
