I've set up an EC2 cluster with Spark. Everything works: the master and all slaves are up and running.
I'm trying to submit a sample job (SparkPi). When I SSH to the cluster and submit it from there, everything works fine. However, when the driver is created on a remote host (my laptop), it doesn't work. I've tried both values of `--deploy-mode`:

**`--deploy-mode=client`:**

From my laptop:

    ./bin/spark-submit --master spark://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:7077 --class SparkPi ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar

This results in the following indefinitely repeating warnings/errors:

> WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
> 15/02/22 18:30:45 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 0
> 15/02/22 18:30:45 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 1

...and failed drivers appear in the Spark Web UI under "Completed Drivers" with "State=ERROR". I've tried passing limits for cores and memory to the submit script, but it didn't help (see the P.S. below for the exact flags).

**`--deploy-mode=cluster`:**

From my laptop:

    ./bin/spark-submit --master spark://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:7077 --deploy-mode cluster --class SparkPi ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar

The result is:

> .... Driver successfully submitted as driver-20150223023734-0007 ... waiting before polling master for driver state ... polling master for driver state
> State of driver-20150223023734-0007 is ERROR
> Exception from cluster was: java.io.FileNotFoundException: File file:/home/oleg/spark/spark12/ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar does not exist.
> java.io.FileNotFoundException: File file:/home/oleg/spark/spark12/ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar does not exist.
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
>     at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:329)
>     at org.apache.spark.deploy.worker.DriverRunner.org$apache$spark$deploy$worker$DriverRunner$$downloadUserJar(DriverRunner.scala:150)
>     at org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:75)

That path is where the jar lives on my laptop, so it looks like the worker that runs the driver is trying to resolve the `file:` URL on its own local filesystem.

So, I'd appreciate any pointers on what is going wrong and some guidance on how to deploy jobs from a remote client. Thanks.
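P.S. For reference, the client-mode attempt with resource limits looked roughly like this (`--total-executor-cores` and `--executor-memory` are the standard spark-submit options I used; the values here are just illustrative):

    ./bin/spark-submit \
      --master spark://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:7077 \
      --total-executor-cores 2 \
      --executor-memory 512m \
      --class SparkPi \
      ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar

For cluster mode, my current guess is that the jar URL has to be readable from the cluster itself, not from my laptop. If that's right, a sketch of what I'm considering (the HDFS host and port below are my assumption about the spark-ec2 setup, not verified):

    # Hypothetical workaround: put the jar somewhere the workers can read it
    hadoop fs -put ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar /tmp/

    # ...then point spark-submit at the HDFS location instead of a local path
    ./bin/spark-submit \
      --master spark://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:7077 \
      --deploy-mode cluster \
      --class SparkPi \
      hdfs://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:9000/tmp/ec2test_2.10-0.0.1.jar

Is that the right approach, or is there a way to have spark-submit upload a local jar to the cluster automatically?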