Following up on my previous emails: when I try to execute this command from
the command line:

./bin/spark-submit --verbose --master yarn-cluster --py-files
 mypython/libs/numpy-1.9.2.zip --deploy-mode cluster
mypython/scripts/kmeans.py /kmeans_data.txt 5 1.0


- numpy-1.9.2.zip is the downloaded NumPy package
- kmeans.py is the default example that ships with Spark 1.4
- kmeans_data.txt is the default data file that ships with Spark 1.4


It fails, saying it cannot find numpy:

File "kmeans.py", line 31, in <module>
    import numpy
ImportError: No module named numpy
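For context on why I suspect the packaging rather than the flag: as far as I understand, --py-files works by putting the zip on sys.path of the executor Python processes, and Python's zipimport can load pure-Python modules from a zip but not compiled C extension modules, which numpy contains. A minimal, Spark-free sketch of the pure-Python case that does work (mylib and VALUE are made-up names for illustration):

```python
import os
import sys
import tempfile
import zipfile

# Build a tiny pure-Python package as a zip, mimicking what --py-files ships.
tmp = tempfile.mkdtemp()
zpath = os.path.join(tmp, "mylib.zip")
with zipfile.ZipFile(zpath, "w") as z:
    z.writestr("mylib/__init__.py", "VALUE = 42\n")

# Spark does the equivalent of this on each executor for --py-files archives.
sys.path.insert(0, zpath)

import mylib
print(mylib.VALUE)  # 42
```

Since numpy's .so files cannot be loaded from inside a zip this way, I assume that is why the import fails even though the archive is shipped correctly.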

Has anyone run a Python Spark application in yarn-cluster mode that ships
3rd-party Python modules with it?

What configuration or installation steps are needed before running a Python
Spark job with 3rd-party dependencies on a Yarn cluster?
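For reference, the fallback I am considering is installing numpy on every NodeManager host and pointing the executors at that interpreter via PYSPARK_PYTHON instead of shipping it with --py-files. A sketch of what I mean (the interpreter path is an assumption for my cluster); would this be the recommended setup for yarn-cluster mode?

```shell
# Assumption: numpy has been pip-installed into this interpreter on every node.
export PYSPARK_PYTHON=/usr/bin/python2.7

./bin/spark-submit --master yarn-cluster --verbose \
  mypython/scripts/kmeans.py /kmeans_data.txt 5 1.0
```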

Thanks in advance.

On Thu, Jun 25, 2015 at 12:09 PM, Elkhan Dadashov <elkhan8...@gmail.com>
wrote:

> Hi all,
>
> Does Spark 1.4 support Python applications on Yarn-cluster?
> (--master yarn-cluster)
>
> Does Spark 1.4 support Python applications with deploy-mode
> cluster? (--deploy-mode cluster)
>
> How can we ship 3rd-party Python dependencies with a Python Spark job?
> (running on a Yarn cluster)
>
> Thanks.
>
>
>
>
>
>
> On Wed, Jun 24, 2015 at 3:13 PM, Elkhan Dadashov <elkhan8...@gmail.com>
> wrote:
>
>> Hi all,
>>
>> I'm trying to run kmeans.py Spark example on Yarn cluster mode. I'm using
>> Spark 1.4.0.
>>
>> I'm passing numpy-1.9.2.zip with --py-files flag.
>>
>> Here is the command I'm trying to execute but it fails:
>>
>> ./bin/spark-submit --master yarn-cluster --verbose  --py-files
>>    mypython/libs/numpy-1.9.2.zip mypython/scripts/kmeans.py
>> /kmeans_data.txt 5 1.0
>>
>>
>> - I have kmeans_data.txt in HDFS in / directory.
>>
>>
>> I receive this error:
>>
>> "
>> ...
>> 15/06/24 15:08:21 INFO yarn.ApplicationMaster: Final app status:
>> SUCCEEDED, exitCode: 0, (reason: Shutdown hook called before final status
>> was reported.)
>> 15/06/24 15:08:21 INFO yarn.ApplicationMaster: Unregistering
>> ApplicationMaster with SUCCEEDED (diag message: Shutdown hook called before
>> final status was reported.)
>> 15/06/24 15:08:21 INFO yarn.ApplicationMaster: Deleting staging directory
>> .sparkStaging/application_1435182120590_0009
>> 15/06/24 15:08:22 INFO util.Utils: Shutdown hook called
>> stdout: Traceback (most recent call last):
>>   File "kmeans.py", line 31, in <module>
>>     import numpy as np
>> ImportError: No module named numpy
>> ...
>>
>> "
>>
>> Any idea why it cannot import numpy from numpy-1.9.2.zip while running the
>> kmeans.py example provided with Spark?
>>
>> How can we run a Python script that depends on other 3rd-party Python
>> modules on yarn-cluster?
>>
>> Thanks.
>>
>>
>
>
> --
>
> Best regards,
> Elkhan Dadashov
>
