Hi all,

Does Spark 1.4 support Python applications on YARN in cluster mode
(--master yarn-cluster)?

Does Spark 1.4 support Python applications with deploy mode cluster
(--deploy-mode cluster)?
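If I understand correctly, these two forms should be equivalent ways of
asking for cluster mode on YARN; this is only a sketch with a made-up
script name (myscript.py), not something I have verified on 1.4:

    ./bin/spark-submit --master yarn-cluster myscript.py
    ./bin/spark-submit --master yarn --deploy-mode cluster myscript.py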

How can we ship third-party Python dependencies with a Python Spark job
running on a YARN cluster? A sketch of what I have in mind is below.
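
From what I've read, pure-Python packages can be zipped and shipped with
--py-files (or added from the driver with sc.addPyFile), while packages
with compiled C extensions, such as numpy, apparently have to be installed
on every worker node instead. The names mydeps.zip and mymodule below are
made up for illustration; this is only a sketch of the approach:

    # Package a pure-Python module and ship it alongside the job:
    zip -r mydeps.zip mymodule/
    ./bin/spark-submit --master yarn-cluster \
        --py-files mydeps.zip \
        mypython/scripts/kmeans.py /kmeans_data.txt 5 1.0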

Thanks.

On Wed, Jun 24, 2015 at 3:13 PM, Elkhan Dadashov <elkhan8...@gmail.com>
wrote:

> Hi all,
>
> I'm trying to run the kmeans.py Spark example in YARN cluster mode. I'm
> using Spark 1.4.0.
>
> I'm passing numpy-1.9.2.zip with the --py-files flag.
>
> Here is the command I'm trying to execute, but it fails:
>
> ./bin/spark-submit --master yarn-cluster --verbose \
>     --py-files mypython/libs/numpy-1.9.2.zip \
>     mypython/scripts/kmeans.py /kmeans_data.txt 5 1.0
>
>
> I have kmeans_data.txt in the root directory (/) of HDFS.
>
>
> I receive this error:
>
> "
> ...
> 15/06/24 15:08:21 INFO yarn.ApplicationMaster: Final app status:
> SUCCEEDED, exitCode: 0, (reason: Shutdown hook called before final status
> was reported.)
> 15/06/24 15:08:21 INFO yarn.ApplicationMaster: Unregistering
> ApplicationMaster with SUCCEEDED (diag message: Shutdown hook called before
> final status was reported.)
> 15/06/24 15:08:21 INFO yarn.ApplicationMaster: Deleting staging directory
> .sparkStaging/application_1435182120590_0009
> 15/06/24 15:08:22 INFO util.Utils: Shutdown hook called
> stdout: Traceback (most recent call last):
>   File "kmeans.py", line 31, in <module>
>     import numpy as np
> ImportError: No module named numpy
> ...
>
> "
>
> Any idea why it cannot import numpy from numpy-1.9.2.zip while running
> the kmeans.py example provided with Spark?
>
> How can we run a Python script that has third-party Python module
> dependencies on yarn-cluster?
>
> Thanks.
>
>


-- 

Best regards,
Elkhan Dadashov
