I've set up a Spark cluster on EC2. Everything works: the master and all
slaves are up and running.
I'm trying to submit a sample job (SparkPi). When I ssh into the cluster and
submit it from there, everything works fine. However, when the driver is
created on a remote host (my laptop), it doesn't work. I've tried both values
of `--deploy-mode`:
**`--deploy-mode=client`:**
From my laptop:
./bin/spark-submit \
  --master spark://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:7077 \
  --class SparkPi ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar
This results in the following warnings/errors, repeated indefinitely:
> WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
> 15/02/22 18:30:45 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 0
> 15/02/22 18:30:45 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 1
...and failed drivers: entries with "State=ERROR" appear under "Completed
Drivers" in the Spark Web UI.
I've tried passing core and memory limits to the submit script, but it
didn't help.
**`--deploy-mode=cluster`:**
From my laptop:
./bin/spark-submit \
  --master spark://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:7077 \
  --deploy-mode cluster \
  --class SparkPi ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar
The result is:
> .... Driver successfully submitted as driver-20150223023734-0007
> ... waiting before polling master for driver state
> ... polling master for driver state
> State of driver-20150223023734-0007 is ERROR
> Exception from cluster was: java.io.FileNotFoundException: File file:/home/oleg/spark/spark12/ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar does not exist.
> java.io.FileNotFoundException: File file:/home/oleg/spark/spark12/ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar does not exist.
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
>     at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:329)
>     at org.apache.spark.deploy.worker.DriverRunner.org$apache$spark$deploy$worker$DriverRunner$$downloadUserJar(DriverRunner.scala:150)
>     at org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:75)
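
Reading the trace, the worker's `DriverRunner` appears to look for the jar at my laptop's path on its own filesystem, where it does not exist. As far as I understand, in cluster mode the jar URL has to be visible from inside the cluster (for example a `file:` path that is present on every node). Below is a minimal, untested sketch of what I might try next. The `root` login, the `/root/jars` directory, the `NODES` list and the `DRY_RUN` switch are my own assumptions, not something confirmed for this cluster:

```shell
#!/bin/sh
# Sketch only: copy the application jar to the same path on every cluster
# node, then submit in cluster mode with a file: URL that exists there.
# ASSUMPTIONS: the "root" login, the /root/jars directory, the NODES list
# and the DRY_RUN switch are hypothetical and may not match the cluster.

MASTER_HOST="ec2-52-10-82-218.us-west-2.compute.amazonaws.com"
LOCAL_JAR="ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar"
REMOTE_DIR="${REMOTE_DIR:-/root/jars}"   # hypothetical target directory
REMOTE_JAR="$REMOTE_DIR/$(basename "$LOCAL_JAR")"

# Space-separated list of every node that may end up running the driver.
# Defaults to the master only; the worker hostnames must be added here.
NODES="${NODES:-$MASTER_HOST}"

# With DRY_RUN=1 (the default) commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Put the jar at the same absolute path on each node.
for node in $NODES; do
  run ssh "root@$node" mkdir -p "$REMOTE_DIR"
  run scp "$LOCAL_JAR" "root@$node:$REMOTE_JAR"
done

# Submit in cluster mode, pointing at the path that now exists on the nodes.
run ./bin/spark-submit \
  --master "spark://$MASTER_HOST:7077" \
  --deploy-mode cluster \
  --class SparkPi \
  "file:$REMOTE_JAR"
```

Run as-is, the script only prints the commands it would execute; setting `DRY_RUN=0` runs them for real. Copying to each node is just one option here; any location that all workers can read should serve the same purpose.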
So I'd appreciate any pointers on what is going wrong, and some guidance on
how to deploy jobs from a remote client. Thanks.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Submitting-jobs-to-Spark-EC2-cluster-remotely-tp21762.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.