Thank you for verifying. Glad to help :)

On Tue, Feb 23, 2016 at 3:51 AM Ian Maloney <rachmaninovquar...@gmail.com>
wrote:

> Hi Mina,
>
> I added your changes and they got the pyspark interpreter working! Thanks
> so much for your help!
>
> Ian
>
>
> On Sunday, February 21, 2016, mina lee <mina...@apache.org> wrote:
>
>> Hi Ian, sorry for the late reply.
>> I was able to reproduce the same error with Spark 1.4.1 & Hadoop
>> 2.6.0. It turned out to be a bug in Zeppelin.
>> After some digging, I realized that the `spark.yarn.isPython` property
>> was only introduced in Spark 1.5.0. I just made a PR (
>> https://github.com/apache/incubator-zeppelin/pull/736) to fix it. It
>> would be really appreciated if you could try it and see if it works.
>> Thank you for reporting the bug!
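>>
>> In the meantime, a possible workaround (a sketch only, I have not tried
>> it on HDP) is to ship the two archives yourself, assuming your build
>> picks up SPARK_SUBMIT_OPTIONS when SPARK_HOME is set, e.g. in
>> conf/zeppelin-env.sh:
>>
>>   # ship the pyspark and py4j sources to the YARN executors explicitly;
>>   # py4j-0.8.2.1 is the version bundled with Spark 1.4, adjust if yours differs
>>   export SPARK_SUBMIT_OPTIONS="--py-files ${SPARK_HOME}/python/lib/pyspark.zip,${SPARK_HOME}/python/lib/py4j-0.8.2.1-src.zip"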
>>
>> Regards,
>> Mina
>>
>> On Thu, Feb 18, 2016 at 2:39 AM, Ian Maloney <
>> rachmaninovquar...@gmail.com> wrote:
>>
>>> Hi Mina,
>>>
>>> Thanks for the response. I recloned the master from GitHub and built
>>> using:
>>> mvn clean package -DskipTests -Pspark-1.4 -Phadoop-2.6 -Pyarn -Ppyspark
>>>
>>> I did that locally, then scp'd the build to a node in a cluster running
>>> HDP 2.3 (Spark 1.4.1 & Hadoop 2.7.1).
>>>
>>> I added the two config files from below and started the Zeppelin daemon.
>>> Inspecting the spark.yarn.isPython property in the Spark UI showed it to
>>> be "true".
>>>
>>> The pyspark interpreter gives the same error as before. Are there any
>>> other configs I should check? I'm beginning to wonder if it's related to
>>> something in Hortonworks' distribution of Spark or YARN.
>>>
>>>
>>>
>>> On Tuesday, February 16, 2016, mina lee <mina...@apache.org> wrote:
>>>
>>>> Hi Ian,
>>>>
>>>> The stack trace looks quite similar to
>>>> https://issues.apache.org/jira/browse/ZEPPELIN-572, which has been
>>>> fixed since v0.5.6.
>>>> This happens when pyspark.zip and py4j-*.zip are not distributed to
>>>> the YARN worker nodes.
>>>>
>>>> If you are building from source, can you please double-check that you
>>>> pulled the latest master?
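>>>>
>>>> One quick sanity check on the build itself: the -Ppyspark profile
>>>> should bundle the Python archives alongside the Spark interpreter, so
>>>> something like this on the Zeppelin node should list them (the path
>>>> below is from my build; yours may differ slightly):
>>>>
>>>>   # both archives should be present if -Ppyspark took effect
>>>>   ls ${ZEPPELIN_HOME}/interpreter/spark/pyspark/
>>>>   # expected: pyspark.zip  py4j-0.8.2.1-src.zip
>>>>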
>>>> And to be sure, can you confirm that you see spark.yarn.isPython set
>>>> to true in the Spark UI (YARN's ApplicationMaster UI) > Environment >
>>>> Spark Properties?
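>>>>
>>>> If the ApplicationMaster UI is awkward to reach, a rough equivalent
>>>> from a pyspark paragraph (sc._conf is pyspark's private handle to the
>>>> underlying SparkConf, so treat this as a debugging hack only) is:
>>>>
>>>>   # prints "true" if Zeppelin propagated the property into the Spark conf
>>>>   print(sc._conf.get("spark.yarn.isPython", "not set"))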
>>>>
>>>> On Sat, Feb 13, 2016 at 1:04 AM, Ian Maloney <
>>>> rachmaninovquar...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I've been trying unsuccessfully to configure the pyspark interpreter
>>>>> on Zeppelin. I can use pyspark from the CLI and can use the Spark
>>>>> interpreter from Zeppelin without issue. Here are the lines which aren't
>>>>> commented out in my zeppelin-env.sh file:
>>>>>
>>>>> export MASTER=yarn-client
>>>>>
>>>>> export ZEPPELIN_PORT=8090
>>>>>
>>>>> export ZEPPELIN_JAVA_OPTS="-Dhdp.version=2.3.2.0-2950
>>>>> -Dspark.yarn.queue=default"
>>>>>
>>>>> export SPARK_HOME=/usr/hdp/current/spark-client/
>>>>>
>>>>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>>>>>
>>>>> export PYSPARK_PYTHON=/usr/bin/python
>>>>>
>>>>> export
>>>>> PYTHONPATH=${SPARK_HOME}/python:${SPARK_HOME}/python/build:$PYTHONPATH
>>>>>
>>>>> Running a simple pyspark script in the interpreter gives this error:
>>>>>
>>>>> Py4JJavaError: An error occurred while calling
>>>>> z:org.apache.spark.api.python.PythonRDD.runJob.
>>>>> : org.apache.spark.SparkException: Job aborted due to stage failure:
>>>>> Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3
>>>>> in stage 1.0 (TID 5, some_yarn_node.networkname):
>>>>> org.apache.spark.SparkException:
>>>>> Error from python worker:
>>>>>   /usr/bin/python: No module named pyspark
>>>>> PYTHONPATH was:
>>>>>   /app/hadoop/yarn/local/usercache/my_username/filecache/4121/spark-assembly-1.4.1.2.3.2.0-2950-hadoop2.7.1.2.3.2.0-2950.jar
>>>>>
>>>>> More details can be found here:
>>>>>
>>>>> https://community.hortonworks.com/questions/16436/cants-get-pyspark-interpreter-to-work-on-zeppelin.html
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Ian
>>>>>
>>>>>
>>>>
>>
