Just make sure you meet the following:

1. Set spark.driver.host to your local IP (the machine where you run your code; it must be reachable from the cluster).
2. Make sure no firewall/router configuration is blocking/filtering the connection between your laptop and the cluster. The best way to test is to ping the laptop's public IP from your cluster (and if the ping works, make sure you are port-forwarding the required ports).
3. Also set spark.driver.port if you don't want to open up all the ports on your Windows machine (the default is random, so stick to one port).

A sketch of a submit command with these settings is shown right after this list.
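For example, assuming your laptop's public IP is 203.0.113.5 and you opened port 51810 (both placeholder values; substitute your own), the submit command from the laptop would look something like this:

    # Pin the driver's advertised address and port so the cluster can
    # connect back to the laptop (203.0.113.5 and 51810 are placeholders;
    # use your own public IP and a port that is open/forwarded):
    ./bin/spark-submit \
      --master spark://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:7077 \
      --conf spark.driver.host=203.0.113.5 \
      --conf spark.driver.port=51810 \
      --class SparkPi \
      ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar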
A similar discussion already happened here; you can go through it:
http://apache-spark-user-list.1001560.n3.nabble.com/Submitting-Spark-job-on-Unix-cluster-from-dev-environment-Windows-td16989.html

Thanks
Best Regards

On Mon, Feb 23, 2015 at 1:25 PM, olegshirokikh <o...@solver.com> wrote:
> I've set up the EC2 cluster with Spark. Everything works; all master/slaves
> are up and running.
>
> I'm trying to submit a sample job (SparkPi). When I ssh to the cluster and
> submit it from there, everything works fine. However, when the driver is
> created on a remote host (my laptop), it doesn't work. I've tried both
> modes for `--deploy-mode`:
>
> **`--deploy-mode=client`:**
>
> From my laptop:
>
>     ./bin/spark-submit \
>       --master spark://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:7077 \
>       --class SparkPi \
>       ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar
>
> This results in the following indefinite warnings/errors:
>
>     WARN TaskSchedulerImpl: Initial job has not accepted any resources;
>     check your cluster UI to ensure that workers are registered and have
>     sufficient memory
>     15/02/22 18:30:45 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 0
>     15/02/22 18:30:45 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 1
>
> ...and failed drivers appear in the Spark Web UI under "Completed Drivers"
> with "State=ERROR".
>
> I've tried to pass limits for cores and memory to the submit script, but it
> didn't help...
>
> **`--deploy-mode=cluster`:**
>
> From my laptop:
>
>     ./bin/spark-submit \
>       --master spark://ec2-52-10-82-218.us-west-2.compute.amazonaws.com:7077 \
>       --deploy-mode cluster \
>       --class SparkPi \
>       ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar
>
> The result is:
>
>     .... Driver successfully submitted as driver-20150223023734-0007 ...
>     waiting before polling master for driver state ... polling master for
>     driver state
>     State of driver-20150223023734-0007 is ERROR
>     Exception from cluster was: java.io.FileNotFoundException: File
>     file:/home/oleg/spark/spark12/ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar
>     does not exist.
>     java.io.FileNotFoundException: File
>     file:/home/oleg/spark/spark12/ec2test/target/scala-2.10/ec2test_2.10-0.0.1.jar
>     does not exist.
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
>     at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:329)
>     at org.apache.spark.deploy.worker.DriverRunner.org$apache$spark$deploy$worker$DriverRunner$$downloadUserJar(DriverRunner.scala:150)
>     at org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:75)
>
> So, I'd appreciate any pointers on what is going wrong and some guidance
> on how to deploy jobs from a remote client. Thanks.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Submitting-jobs-to-Spark-EC2-cluster-remotely-tp21762.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.