One last email to announce that I've fixed all of the issues. Don't
hesitate to contact me if you run into the same problems; I'd be happy to help.
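For anyone who lands on this thread with the same “Must specify a primary
resource” error, the packaging approach suggested downthread can be sketched
roughly as follows. The file names and the master URL are placeholders, not my
actual setup; 7077 is the standalone master's default port:

```shell
# Zip only the helper modules; the primary script stays outside the zip
# and is passed last on the command line.
zip deps.zip helper_a.py helper_b.py

# Both the zip and the main script are local paths, not S3 URLs --
# only local primary and additional .py files appear to be supported.
./bin/spark-submit \
  --master spark://<master-host>:7077 \
  --py-files deps.zip \
  main.py
```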

Regards,
Kevin
On 14 Apr 2016 12:39 p.m., "Kevin Eid" <kevin.e...@mail.dcu.ie> wrote:

> Hi all,
>
> I managed to copy my .py files from my local machine to the cluster using
> SCP, and I managed to run my Spark app on the cluster against a small dataset.
>
> However, when I iterate over a 5 GB dataset, I get
> org.apache.spark.shuffle.MetadataFetchFailedException (please see the
> attached screenshots).
>
> I am deploying 3 m3.xlarge instances and submitting the app with the
> following parameters: --executor-memory 50g --driver-memory 20g
> --executor-cores 4 --num-executors 3.
>
> Can you recommend other settings (number of executors, driver/executor
> memory), or do I have to deploy more and larger instances in order to run
> my app on 5 GB? Or do I need to add more partitions while reading the file?
>
> Best,
> Kevin
>
> On 12 April 2016 at 12:19, Sun, Rui <rui....@intel.com> wrote:
>
>> Which py file is your main file (primary py file)? Zip the other two py
>> files. Leave the main py file alone. Don't copy them to S3 because it seems
>> that only local primary and additional py files are supported.
>>
>> ./bin/spark-submit --master spark://... --py-files <zip file> <main py
>> file>
>>
>> -----Original Message-----
>> From: kevllino [mailto:kevin.e...@mail.dcu.ie]
>> Sent: Tuesday, April 12, 2016 5:07 PM
>> To: user@spark.apache.org
>> Subject: Run a self-contained Spark app on a Spark standalone cluster
>>
>> Hi,
>>
>> I need to know how to run a self-contained Spark app (3 Python files) on
>> a Spark standalone cluster. Can I move the .py files to the cluster, or
>> should I store them locally, on HDFS, or on S3? I tried the following
>> locally and on S3 with a zip of my .py files, as suggested here <
>> http://spark.apache.org/docs/latest/submitting-applications.html>:
>>
>> ./bin/spark-submit --master
>> spark://ec2-54-51-23-172.eu-west-1.compute.amazonaws.com:5080
>> --py-files
>> s3n://AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY@mubucket
>> //weather_predict.zip
>>
>> But I get: “Error: Must specify a primary resource (JAR or Python file)”
>>
>> Best,
>> Kevin
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Run-a-self-contained-Spark-app-on-a-Spark-standalone-cluster-tp26753.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>>
>
>
> --
> Kevin EID
> M.Sc. in Computing, Data Analytics
> <https://fr.linkedin.com/pub/kevin-eid/85/689/b01>
>
>

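On the partitioning and memory question above: one thing worth noting is that
an m3.xlarge has 15 GiB of RAM, so per-executor memory has to fit well within
that. A hedged sketch of a submit command sized for those instances follows;
the flag and config values are illustrative guesses, not a tested
recommendation, and deps.zip/main.py are placeholder names:

```shell
# Keep executor memory below the 15 GiB available on an m3.xlarge,
# and raise the default shuffle/RDD partition count so a 5 GB input
# is split into smaller, more numerous tasks.
./bin/spark-submit \
  --master spark://<master-host>:7077 \
  --executor-memory 10g \
  --driver-memory 4g \
  --executor-cores 4 \
  --num-executors 3 \
  --conf spark.default.parallelism=48 \
  --py-files deps.zip \
  main.py
```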