Hi Hokam,
Thank you very much. Your approach really works after I added the hostname/IP mappings to the Windows hosts file.
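For reference, what I added to C:\Windows\System32\drivers\etc\hosts boils down to the hostname/IP pairs of the cluster nodes (the same ones listed in the table further down in this thread), roughly like this:

    10.20.17.70  hadoop00
    10.20.17.71  hadoop01
    10.20.17.72  hadoop02
    10.20.17.73  hadoop03
    10.20.17.74  hadoop04
    10.20.17.75  hadoop05
    10.20.17.76  hadoop06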
However, a new error has come up. I think it is a fairly common one, as I have seen the same messages reported in many places. Here is part of the output from the Eclipse console:

15/12/24 11:59:08 INFO SparkDeploySchedulerBackend: Granted executor ID app-20151224105757-0002/92 on hostPort 10.20.17.74:44097 with 4 cores, 1024.0 MB RAM
15/12/24 11:59:08 INFO AppClient$ClientEndpoint: Executor updated: app-20151224105757-0002/92 is now LOADING
15/12/24 11:59:08 INFO AppClient$ClientEndpoint: Executor updated: app-20151224105757-0002/92 is now RUNNING
15/12/24 11:59:12 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/12/24 11:59:27 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/12/24 11:59:42 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/12/24 11:59:57 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/12/24 12:00:12 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/12/24 12:00:27 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/12/24 12:00:42 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/12/24 12:00:57 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/12/24 12:01:08 INFO AppClient$ClientEndpoint: Executor updated: app-20151224105757-0002/90 is now EXITED (Command exited with code 1)
15/12/24 12:01:08 INFO SparkDeploySchedulerBackend: Executor app-20151224105757-0002/90 removed: Command exited with code 1
15/12/24 12:01:08 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 90
15/12/24 12:01:08 INFO AppClient$ClientEndpoint: Executor added: app-20151224105757-0002/93 on worker-20151221140040-10.20.17.76-33817 (10.20.17.76:33817) with 4 cores
15/12/24 12:01:08 INFO SparkDeploySchedulerBackend: Granted executor ID app-20151224105757-0002/93 on hostPort 10.20.17.76:33817 with 4 cores, 1024.0 MB RAM
15/12/24 12:01:08 INFO AppClient$ClientEndpoint: Executor updated: app-20151224105757-0002/93 is now LOADING
15/12/24 12:01:08 INFO AppClient$ClientEndpoint: Executor updated: app-20151224105757-0002/93 is now RUNNING
15/12/24 12:01:09 INFO AppClient$ClientEndpoint: Executor updated: app-20151224105757-0002/91 is now EXITED (Command exited with code 1)
15/12/24 12:01:09 INFO SparkDeploySchedulerBackend: Executor app-20151224105757-0002/91 removed: Command exited with code 1
15/12/24 12:01:09 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 91
15/12/24 12:01:09 INFO AppClient$ClientEndpoint: Executor added: app-20151224105757-0002/94 on worker-20151221140040-10.20.17.75-47807 (10.20.17.75:47807) with 4 cores
15/12/24 12:01:09 INFO SparkDeploySchedulerBackend: Granted executor ID app-20151224105757-0002/94 on hostPort 10.20.17.75:47807 with 4 cores, 1024.0 MB RAM
15/12/24 12:01:09 INFO AppClient$ClientEndpoint: Executor updated: app-20151224105757-0002/94 is now LOADING
15/12/24 12:01:09 INFO AppClient$ClientEndpoint: Executor updated: app-20151224105757-0002/94 is now RUNNING
15/12/24 12:01:10 INFO AppClient$ClientEndpoint: Executor updated: app-20151224105757-0002/92 is now EXITED (Command exited with code 1)
15/12/24 12:01:10 INFO SparkDeploySchedulerBackend: Executor app-20151224105757-0002/92 removed: Command exited with code 1
15/12/24 12:01:10 INFO SparkDeploySchedulerBackend: Asked to remove non-existent executor 92
15/12/24 12:01:10 INFO AppClient$ClientEndpoint: Executor added: app-20151224105757-0002/95 on worker-20151221193318-10.20.17.74-44097 (10.20.17.74:44097) with 4 cores
15/12/24 12:01:10 INFO SparkDeploySchedulerBackend: Granted executor ID app-20151224105757-0002/95 on hostPort 10.20.17.74:44097 with 4 cores, 1024.0 MB RAM
15/12/24 12:01:10 INFO AppClient$ClientEndpoint: Executor updated: app-20151224105757-0002/95 is now LOADING
15/12/24 12:01:10 INFO AppClient$ClientEndpoint: Executor updated: app-20151224105757-0002/95 is now RUNNING
15/12/24 12:01:12 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/12/24 12:01:27 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/12/24 12:01:42 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
...

The logs shown in Spark's master UI indicate that every Worker hits the following exception:

15/12/24 17:17:29 INFO Utils: Successfully started service 'driverPropsFetcher' on port 50576.
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1643)
        at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:68)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:149)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:250)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.lookupTimeout
        at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcEnv.scala:214)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:229)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:225)
        at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:242)
        at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:98)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:162)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:69)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:68)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        ... 4 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:107)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:241)
        ... 11 more
I think it is still a network connection problem, but I don't think I'm far from victory. I'll continue with Akhil's approach of setting up NAT and port forwarding (I haven't been able to get in touch with the person in charge of that yet).
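In the meantime I'm also thinking of trying Akhil's third suggestion and setting spark.driver.host explicitly, so the executors know how to connect back to my Windows machine. A rough sketch of what I have in mind for the driver code (10.20.6.23 is my machine's address from the thread below; the fixed port 51000 is just a placeholder I would open in the Windows firewall, and the object name is only for this sketch):

    import org.apache.spark.{SparkConf, SparkContext}

    object WordCountWithDriverHost {
      def main(args: Array[String]): Unit = {
        // Sketch only: tell the cluster how to reach the driver running on my Windows box.
        val conf = new SparkConf()
          .setAppName("Scala WordCount")
          .setMaster("spark://hadoop00:7077")        // hostname instead of IP, per Hokam's tip
          .set("spark.driver.host", "10.20.6.23")    // address the executors should connect back to
          .set("spark.driver.port", "51000")         // placeholder: fix the driver port so it can be opened in the firewall
          .setJars(List("C:\\Temp\\test.jar"))
        val sc = new SparkContext(conf)
        try {
          val textFile = sc.textFile("hdfs://10.20.17.70:9000/wc/indata/wht.txt")
          textFile.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).collect().foreach(println)
        } finally {
          sc.stop()
        }
      }
    }

If the executors still time out after that, I suppose raising spark.rpc.lookupTimeout (the timeout named in the worker exception above) is another thing to try, though that would probably only hide the underlying connectivity problem.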
If you have better suggestions, please let me know as soon as possible.

Best Regards,
Yefeng

------------------ Original Message ------------------
From: "Hokam Singh Chauhan" <hokam.1...@gmail.com>
Date: Thursday, December 24, 2015, 10:54 AM
To: "Akhil Das" <ak...@sigmoidanalytics.com>
Cc: "user" <user@spark.apache.org>; <holy...@qq.com>
Subject: Re: Problem of submitting Spark task to cluster from eclipse IDE on Windows

Hi,

Use spark://hostname:7077 as the Spark master if you are currently using an IP address in place of the hostname. I faced the same issue, and it was resolved by using the hostname in the Spark master URL instead of the IP address.

Regards,
Hokam

On 23 Dec 2015 13:41, "Akhil Das" <ak...@sigmoidanalytics.com> wrote:

You need to:
1. Make sure your local router has NAT enabled and has the networking ports listed here forwarded.
2. Make sure port 7077 on your cluster is accessible from your local (public) IP address. You can try: telnet 10.20.17.70 7077
3. Set spark.driver.host so that the cluster can connect back to your machine.

Thanks
Best Regards

On Wed, Dec 23, 2015 at 10:02 AM, superbee84 <holy...@qq.com> wrote:

Hi All,

I'm new to Spark. Before I describe the problem, I'd like to let you know the roles of the machines that make up the cluster and the purpose of my work. By reading and following the instructions and tutorials, I successfully built a cluster of 7 CentOS 6.5 machines, with Hadoop 2.7.1, Spark 1.5.1, Scala 2.10.4 and ZooKeeper 3.4.5 installed on them. The details are listed below:

Host Name | IP Address  | Hadoop 2.7.1              | Spark 1.5.1      | ZooKeeper
hadoop00  | 10.20.17.70 | NameNode (Active)         | Master (Active)  | none
hadoop01  | 10.20.17.71 | NameNode (Standby)        | Master (Standby) | none
hadoop02  | 10.20.17.72 | ResourceManager (Active)  | none             | none
hadoop03  | 10.20.17.73 | ResourceManager (Standby) | none             | none
hadoop04  | 10.20.17.74 | DataNode                  | Worker           | JournalNode
hadoop05  | 10.20.17.75 | DataNode                  | Worker           | JournalNode
hadoop06  | 10.20.17.76 | DataNode                  | Worker           | JournalNode

Now my *purpose* is to develop Hadoop/Spark applications on my own computer (IP: 10.20.6.23) and submit them to the remote cluster. As everyone else in our group is used to working with Eclipse on Windows, I'm trying to make that setup work. I have successfully submitted the WordCount MapReduce job to YARN from Eclipse on Windows, and it ran smoothly. But when I try to run the Spark WordCount, I get the following error in the Eclipse console:

15/12/23 11:15:30 INFO AppClient$ClientEndpoint: Connecting to master spark://10.20.17.70:7077...
15/12/23 11:15:50 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[appclient-registration-retry-thread,5,main]
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@29ed85e7 rejected from java.util.concurrent.ThreadPoolExecutor@28f21632[Running, pool size = 1, active threads = 0, queued tasks = 0, completed tasks = 1]
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
        at java.util.concurrent.AbstractExecutorService.submit(Unknown Source)
        at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:96)
        at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:95)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
        at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
        at org.apache.spark.deploy.client.AppClient$ClientEndpoint.tryRegisterAllMasters(AppClient.scala:95)
        at org.apache.spark.deploy.client.AppClient$ClientEndpoint.org$apache$spark$deploy$client$AppClient$ClientEndpoint$$registerWithMaster(AppClient.scala:121)
        at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2$$anonfun$run$1.apply$mcV$sp(AppClient.scala:132)
        at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1119)
        at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2.run(AppClient.scala:124)
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
15/12/23 11:15:50 INFO DiskBlockManager: Shutdown hook called
15/12/23 11:15:50 INFO ShutdownHookManager: Shutdown hook called

Then I checked the Spark Master log and found the following critical statements:

15/12/23 11:15:33 ERROR ErrorMonitor: dropping message [class akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://sparkMaster@10.20.17.70:7077/]] arriving at [akka.tcp://sparkMaster@10.20.17.70:7077] inbound addresses are [akka.tcp://sparkMaster@hadoop00:7077] akka.event.Logging$Error$NoCause$
15/12/23 11:15:53 INFO Master: 10.20.6.23:56374 got disassociated, removing it.
15/12/23 11:15:53 INFO Master: 10.20.6.23:56374 got disassociated, removing it.
15/12/23 11:15:53 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@10.20.6.23:56374] has failed, address is now gated for [5000] ms. Reason: [Disassociated]

Here's my Scala code:

import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]) {
    val conf = new SparkConf()
      .setAppName("Scala WordCount")
      .setMaster("spark://10.20.17.70:7077")
      .setJars(List("C:\\Temp\\test.jar"))
    val sc = new SparkContext(conf)
    val textFile = sc.textFile("hdfs://10.20.17.70:9000/wc/indata/wht.txt")
    textFile.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).collect().foreach(println)
  }
}

To solve the problem, I tried the following:
(1) Ran spark-shell to check the Scala version, and confirmed that it is 2.10.4 and compatible with the eclipse-scala plugin.
(2) Ran spark-submit on the SparkPi example, specifying the --master parameter as "10.20.17.70:7077"; it successfully worked out the result, and I was also able to see the application history on the Master's web UI.
(3) Turned off the firewall on my Windows machine.

Unfortunately, the error message remains. Could anybody give me some suggestions? Thanks very much!

Yours Sincerely,
Yefeng

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Problem-of-submitting-Spark-task-to-cluster-from-eclipse-IDE-on-Windows-tp25778.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.