You need to:

1. Make sure your local router has NAT enabled and forwards the networking ports listed here: <http://spark.apache.org/docs/latest/configuration.html#networking>
2. Make sure port 7077 on your cluster is reachable from your local (public) IP address. You can try: telnet 10.20.17.70 7077
3. Set spark.driver.host so that the cluster can connect back to your machine. A minimal sketch follows below.
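For (3), here's a minimal sketch of what that can look like from the driver
side, assuming the addresses and jar path from your mail below (the object
name and the trivial job at the end are just illustrative):

    import java.net.{InetSocketAddress, Socket}

    import org.apache.spark.{SparkConf, SparkContext}

    object DriverHostCheck {
      def main(args: Array[String]): Unit = {
        // (2) as code: fail fast if the master's port 7077 is not
        // reachable from this machine (same idea as the telnet check).
        val probe = new Socket()
        try probe.connect(new InetSocketAddress("10.20.17.70", 7077), 5000)
        finally probe.close()

        val conf = new SparkConf()
          .setAppName("Scala WordCount")
          .setMaster("spark://10.20.17.70:7077")
          // (3): the address the cluster uses to connect back to the
          // driver; 10.20.6.23 is your Windows machine's IP.
          .set("spark.driver.host", "10.20.6.23")
          .setJars(List("C:\\Temp\\test.jar"))

        val sc = new SparkContext(conf)
        // A trivial job to verify that registration and scheduling work.
        try println(sc.parallelize(1 to 100).sum())
        finally sc.stop()
      }
    }

One more thing worth checking: your master log below says its inbound
address is akka.tcp://sparkMaster@hadoop00:7077, and the standalone master
drops messages whose address doesn't match exactly, so you may also need to
use spark://hadoop00:7077 as the master URL (with hadoop00 resolvable from
your Windows machine).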
Thanks
Best Regards

On Wed, Dec 23, 2015 at 10:02 AM, superbee84 <holy...@qq.com> wrote:

> Hi All,
>
> I'm new to Spark. Before I describe the problem, I'd like to let you know
> the roles of the machines that make up the cluster and the purpose of my
> work. By reading and following the instructions and tutorials, I
> successfully built a cluster of 7 CentOS 6.5 machines. I installed Hadoop
> 2.7.1, Spark 1.5.1, Scala 2.10.4 and ZooKeeper 3.4.5 on them. The details
> are listed below:
>
> Host Name | IP Address  | Hadoop 2.7.1              | Spark 1.5.1      | ZooKeeper
> hadoop00  | 10.20.17.70 | NameNode (Active)         | Master (Active)  | none
> hadoop01  | 10.20.17.71 | NameNode (Standby)        | Master (Standby) | none
> hadoop02  | 10.20.17.72 | ResourceManager (Active)  | none             | none
> hadoop03  | 10.20.17.73 | ResourceManager (Standby) | none             | none
> hadoop04  | 10.20.17.74 | DataNode                  | Worker           | JournalNode
> hadoop05  | 10.20.17.75 | DataNode                  | Worker           | JournalNode
> hadoop06  | 10.20.17.76 | DataNode                  | Worker           | JournalNode
>
> Now my *purpose* is to develop Hadoop/Spark applications on my own
> computer (IP: 10.20.6.23) and submit them to the remote cluster. As all
> the other guys in our group are in the habit of using Eclipse on Windows,
> I'm trying to work with this setup. I have successfully submitted the
> WordCount MapReduce job to YARN and it ran smoothly through Eclipse on
> Windows. But when I tried to run the Spark WordCount, it gave me the
> following error in the Eclipse console:
>
> 15/12/23 11:15:30 INFO AppClient$ClientEndpoint: Connecting to master
> spark://10.20.17.70:7077...
> 15/12/23 11:15:50 ERROR SparkUncaughtExceptionHandler: Uncaught exception
> in thread Thread[appclient-registration-retry-thread,5,main]
> java.util.concurrent.RejectedExecutionException: Task
> java.util.concurrent.FutureTask@29ed85e7 rejected from
> java.util.concurrent.ThreadPoolExecutor@28f21632[Running, pool size = 1,
> active threads = 0, queued tasks = 0, completed tasks = 1]
>         at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
>         at java.util.concurrent.AbstractExecutorService.submit(Unknown Source)
>         at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:96)
>         at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:95)
>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>         at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>         at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
>         at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>         at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
>         at org.apache.spark.deploy.client.AppClient$ClientEndpoint.tryRegisterAllMasters(AppClient.scala:95)
>         at org.apache.spark.deploy.client.AppClient$ClientEndpoint.org$apache$spark$deploy$client$AppClient$ClientEndpoint$$registerWithMaster(AppClient.scala:121)
>         at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2$$anonfun$run$1.apply$mcV$sp(AppClient.scala:132)
>         at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1119)
>         at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2.run(AppClient.scala:124)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>         at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>         at java.lang.Thread.run(Unknown Source)
> 15/12/23 11:15:50 INFO DiskBlockManager: Shutdown hook called
> 15/12/23 11:15:50 INFO ShutdownHookManager: Shutdown hook called
>
> Then I checked the Spark Master log and found the following critical
> statements:
>
> 15/12/23 11:15:33 ERROR ErrorMonitor: dropping message [class
> akka.actor.ActorSelectionMessage] for non-local recipient
> [Actor[akka.tcp://sparkMaster@10.20.17.70:7077/]] arriving at
> [akka.tcp://sparkMaster@10.20.17.70:7077] inbound addresses are
> [akka.tcp://sparkMaster@hadoop00:7077]
> akka.event.Logging$Error$NoCause$
> 15/12/23 11:15:53 INFO Master: 10.20.6.23:56374 got disassociated,
> removing it.
> 15/12/23 11:15:53 INFO Master: 10.20.6.23:56374 got disassociated,
> removing it.
> 15/12/23 11:15:53 WARN ReliableDeliverySupervisor: Association with remote
> system [akka.tcp://sparkDriver@10.20.6.23:56374] has failed, address is
> now gated for [5000] ms. Reason: [Disassociated]
>
> Here's my Scala code:
>
> object WordCount {
>   def main(args: Array[String]) {
>     val conf = new SparkConf().setAppName("Scala WordCount")
>       .setMaster("spark://10.20.17.70:7077")
>       .setJars(List("C:\\Temp\\test.jar"))
>     val sc = new SparkContext(conf)
>     val textFile = sc.textFile("hdfs://10.20.17.70:9000/wc/indata/wht.txt")
>     textFile.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).collect().foreach(println)
>   }
> }
>
> To solve the problem, I tried the following:
>
> (1) Ran spark-shell to check the Scala version, which proved to be 2.10.4
> and compatible with the Eclipse Scala plugin.
> (2) Ran spark-submit on the SparkPi example, specifying the --master
> parameter as "10.20.17.70:7077", and it successfully worked out the
> result. I was also able to see the application history on the Master's
> web UI.
> (3) Turned off the firewall on my Windows machine.
>
> Unfortunately, the error message remains. Could anybody give me some
> suggestions? Thanks very much!
>
> Yours Sincerely,
> Yefeng