Re: Problem of submitting Spark task to cluster from eclipse IDE on Windows

Wed, 23 Dec 2015 20:19:51 -0800

Hi Hokam,


    Thank you very much. Your approach works now that I have added the 
hostname/IP entries to the Windows hosts file. However, a new error has 
appeared. I think it is a very common one, as I have seen the same message 
in many places. 
    Here is part of the output from the Eclipse console.


    15/12/24 11:59:08 INFO SparkDeploySchedulerBackend: Granted executor ID 
app-20151224105757-0002/92 on hostPort 10.20.17.74:44097 with 4 cores, 1024.0 
MB RAM
15/12/24 11:59:08 INFO AppClient$ClientEndpoint: Executor updated: 
app-20151224105757-0002/92 is now LOADING
15/12/24 11:59:08 INFO AppClient$ClientEndpoint: Executor updated: 
app-20151224105757-0002/92 is now RUNNING
15/12/24 11:59:12 WARN TaskSchedulerImpl: Initial job has not accepted any 
resources; check your cluster UI to ensure that workers are registered and have 
sufficient resources
15/12/24 11:59:27 WARN TaskSchedulerImpl: Initial job has not accepted any 
resources; check your cluster UI to ensure that workers are registered and have 
sufficient resources
15/12/24 11:59:42 WARN TaskSchedulerImpl: Initial job has not accepted any 
resources; check your cluster UI to ensure that workers are registered and have 
sufficient resources
15/12/24 11:59:57 WARN TaskSchedulerImpl: Initial job has not accepted any 
resources; check your cluster UI to ensure that workers are registered and have 
sufficient resources
15/12/24 12:00:12 WARN TaskSchedulerImpl: Initial job has not accepted any 
resources; check your cluster UI to ensure that workers are registered and have 
sufficient resources
15/12/24 12:00:27 WARN TaskSchedulerImpl: Initial job has not accepted any 
resources; check your cluster UI to ensure that workers are registered and have 
sufficient resources
15/12/24 12:00:42 WARN TaskSchedulerImpl: Initial job has not accepted any 
resources; check your cluster UI to ensure that workers are registered and have 
sufficient resources
15/12/24 12:00:57 WARN TaskSchedulerImpl: Initial job has not accepted any 
resources; check your cluster UI to ensure that workers are registered and have 
sufficient resources
15/12/24 12:01:08 INFO AppClient$ClientEndpoint: Executor updated: 
app-20151224105757-0002/90 is now EXITED (Command exited with code 1)
15/12/24 12:01:08 INFO SparkDeploySchedulerBackend: Executor 
app-20151224105757-0002/90 removed: Command exited with code 1
15/12/24 12:01:08 INFO SparkDeploySchedulerBackend: Asked to remove 
non-existent executor 90
15/12/24 12:01:08 INFO AppClient$ClientEndpoint: Executor added: 
app-20151224105757-0002/93 on worker-20151221140040-10.20.17.76-33817 
(10.20.17.76:33817) with 4 cores
15/12/24 12:01:08 INFO SparkDeploySchedulerBackend: Granted executor ID 
app-20151224105757-0002/93 on hostPort 10.20.17.76:33817 with 4 cores, 1024.0 
MB RAM
15/12/24 12:01:08 INFO AppClient$ClientEndpoint: Executor updated: 
app-20151224105757-0002/93 is now LOADING
15/12/24 12:01:08 INFO AppClient$ClientEndpoint: Executor updated: 
app-20151224105757-0002/93 is now RUNNING
15/12/24 12:01:09 INFO AppClient$ClientEndpoint: Executor updated: 
app-20151224105757-0002/91 is now EXITED (Command exited with code 1)
15/12/24 12:01:09 INFO SparkDeploySchedulerBackend: Executor 
app-20151224105757-0002/91 removed: Command exited with code 1
15/12/24 12:01:09 INFO SparkDeploySchedulerBackend: Asked to remove 
non-existent executor 91
15/12/24 12:01:09 INFO AppClient$ClientEndpoint: Executor added: 
app-20151224105757-0002/94 on worker-20151221140040-10.20.17.75-47807 
(10.20.17.75:47807) with 4 cores
15/12/24 12:01:09 INFO SparkDeploySchedulerBackend: Granted executor ID 
app-20151224105757-0002/94 on hostPort 10.20.17.75:47807 with 4 cores, 1024.0 
MB RAM
15/12/24 12:01:09 INFO AppClient$ClientEndpoint: Executor updated: 
app-20151224105757-0002/94 is now LOADING
15/12/24 12:01:09 INFO AppClient$ClientEndpoint: Executor updated: 
app-20151224105757-0002/94 is now RUNNING
15/12/24 12:01:10 INFO AppClient$ClientEndpoint: Executor updated: 
app-20151224105757-0002/92 is now EXITED (Command exited with code 1)
15/12/24 12:01:10 INFO SparkDeploySchedulerBackend: Executor 
app-20151224105757-0002/92 removed: Command exited with code 1
15/12/24 12:01:10 INFO SparkDeploySchedulerBackend: Asked to remove 
non-existent executor 92
15/12/24 12:01:10 INFO AppClient$ClientEndpoint: Executor added: 
app-20151224105757-0002/95 on worker-20151221193318-10.20.17.74-44097 
(10.20.17.74:44097) with 4 cores
15/12/24 12:01:10 INFO SparkDeploySchedulerBackend: Granted executor ID 
app-20151224105757-0002/95 on hostPort 10.20.17.74:44097 with 4 cores, 1024.0 
MB RAM
15/12/24 12:01:10 INFO AppClient$ClientEndpoint: Executor updated: 
app-20151224105757-0002/95 is now LOADING
15/12/24 12:01:10 INFO AppClient$ClientEndpoint: Executor updated: 
app-20151224105757-0002/95 is now RUNNING
15/12/24 12:01:12 WARN TaskSchedulerImpl: Initial job has not accepted any 
resources; check your cluster UI to ensure that workers are registered and have 
sufficient resources
15/12/24 12:01:27 WARN TaskSchedulerImpl: Initial job has not accepted any 
resources; check your cluster UI to ensure that workers are registered and have 
sufficient resources
15/12/24 12:01:42 WARN TaskSchedulerImpl: Initial job has not accepted any 
resources; check your cluster UI to ensure that workers are registered and have 
sufficient resources
...
 
      The logs in Spark's master UI show that every Worker reports the following exception:
15/12/24 17:17:29 INFO Utils: Successfully started service 'driverPropsFetcher' on port 50576.
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1643)
        at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:68)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:149)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:250)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.lookupTimeout
        at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcEnv.scala:214)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:229)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:225)
        at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:242)
        at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:98)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:162)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:69)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:68)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        ... 4 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:107)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:241)
        ... 11 more

      I think it is still a network connection problem, but we are close to 
solving it. I will go on to try Akhil's approach of setting up NAT and port 
forwarding (I have not yet been able to reach the person in charge of this). 
If you have any better suggestions, please let me know as soon as possible.


Best Regards,
Yefeng
      
    




------------------ Original Message ------------------
From: "Hokam Singh Chauhan";<hokam.1...@gmail.com>;
Sent: 2015-12-24 10:54
To: "Akhil Das"<ak...@sigmoidanalytics.com>; 
Cc: "user"<user@spark.apache.org>; <holy...@qq.com>; 
Subject: Re: Problem of submitting Spark task to cluster from eclipse IDE on 
Windows




Hi,
 
Use spark://hostname:7077 as the Spark master URL if you are currently using 
the IP address in place of the hostname.
 
I faced the same issue; it was resolved by using the hostname in the Spark 
master URL instead of the IP address.
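 
A minimal sketch of that change, assuming the master advertises itself as 
hadoop00 (the name shown in the master log quoted further down) and that 
hadoop00 resolves on the Windows machine through the hosts file. The object 
name MasterByHostname is only illustrative; the application name and jar path 
are taken from the original code.

```scala
import org.apache.spark.SparkConf

object MasterByHostname {
  // Assumption: "hadoop00" is the hostname the active master binds to, and the
  // Windows hosts file maps it to 10.20.17.70.
  def conf: SparkConf = new SparkConf()
    .setAppName("Scala WordCount")
    .setMaster("spark://hadoop00:7077") // hostname instead of the raw IP
    .setJars(List("C:\\Temp\\test.jar"))
}
```

The resulting configuration would then be passed to new SparkContext(...) 
exactly as before.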
 
Regards,
 Hokam
 On 23 Dec 2015 13:41, "Akhil Das" <ak...@sigmoidanalytics.com> wrote:
You need to:


1. Make sure your local router has NAT enabled and forwards the 
networking ports listed here.
2. Make sure port 7077 on your cluster is accessible from your local (public) 
IP address. You can check with: telnet 10.20.17.70 7077
3. Set spark.driver.host so that the cluster can connect back to your machine.
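
For step 3, a rough sketch of how the property could be set in code. It 
assumes that 10.20.6.23 (the development machine's address given in the 
original post) is reachable from the workers. The fixed spark.driver.port value 
51000 is an arbitrary illustrative choice that makes the port easier to forward 
or open in a firewall, and the object name DriverHostConf is hypothetical.

```scala
import org.apache.spark.SparkConf

object DriverHostConf {
  // Assumptions: 10.20.6.23 is the driver machine's address as seen from the
  // cluster; 51000 is an arbitrary free port picked only for illustration.
  def conf: SparkConf = new SparkConf()
    .setAppName("Scala WordCount")
    .setMaster("spark://hadoop00:7077") // master URL as the master advertises it
    .set("spark.driver.host", "10.20.6.23") // address executors connect back to
    .set("spark.driver.port", "51000")      // pin the driver port instead of a random one
}
```

Without a fixed spark.driver.port the driver picks a random port on each run, 
which makes port forwarding hard to configure in advance.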






Thanks
Best Regards



 
On Wed, Dec 23, 2015 at 10:02 AM, superbee84 <holy...@qq.com> wrote:
Hi All,
 
    I'm new to Spark. Before I describe the problem, I'd like to explain
 the roles of the machines that make up the cluster and the purpose of my
 work. By reading and following the instructions and tutorials, I successfully
 built a cluster of 7 CentOS-6.5 machines. I installed Hadoop 2.7.1,
 Spark 1.5.1, Scala 2.10.4 and ZooKeeper 3.4.5 on them. The details are
 listed below:
 
 
 Host Name | IP Address  | Hadoop 2.7.1             | Spark 1.5.1     | ZooKeeper
 hadoop00  | 10.20.17.70 | NameNode(Active)         | Master(Active)  | none
 hadoop01  | 10.20.17.71 | NameNode(Standby)        | Master(Standby) | none
 hadoop02  | 10.20.17.72 | ResourceManager(Active)  | none            | none
 hadoop03  | 10.20.17.73 | ResourceManager(Standby) | none            | none
 hadoop04  | 10.20.17.74 | DataNode                 | Worker          | JournalNode
 hadoop05  | 10.20.17.75 | DataNode                 | Worker          | JournalNode
 hadoop06  | 10.20.17.76 | DataNode                 | Worker          | JournalNode
 
    Now my *purpose* is to develop Hadoop/Spark applications on my own
 computer (IP: 10.20.6.23) and submit them to the remote cluster. As everyone
 else in our group is used to working with Eclipse on Windows, I'm trying
 to make this setup work. I have successfully submitted the WordCount
 MapReduce job to YARN from Eclipse on Windows, and it ran smoothly. But when
 I tried to run the Spark WordCount, I got the following error in the Eclipse
 console:
 
 15/12/23 11:15:30 INFO AppClient$ClientEndpoint: Connecting to master
 spark://10.20.17.70:7077...
 15/12/23 11:15:50 ERROR SparkUncaughtExceptionHandler: Uncaught exception in
 thread Thread[appclient-registration-retry-thread,5,main]
 java.util.concurrent.RejectedExecutionException: Task
 java.util.concurrent.FutureTask@29ed85e7 rejected from
 java.util.concurrent.ThreadPoolExecutor@28f21632[Running, pool size = 1,
 active threads = 0, queued tasks = 0, completed tasks = 1]
         at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source)
         at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
         at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
         at java.util.concurrent.AbstractExecutorService.submit(Unknown Source)
         at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:96)
         at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:95)
         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
         at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
         at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
         at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
         at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
         at org.apache.spark.deploy.client.AppClient$ClientEndpoint.tryRegisterAllMasters(AppClient.scala:95)
         at org.apache.spark.deploy.client.AppClient$ClientEndpoint.org$apache$spark$deploy$client$AppClient$ClientEndpoint$$registerWithMaster(AppClient.scala:121)
         at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2$$anonfun$run$1.apply$mcV$sp(AppClient.scala:132)
         at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1119)
         at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2.run(AppClient.scala:124)
         at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
         at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
         at java.lang.Thread.run(Unknown Source)
 15/12/23 11:15:50 INFO DiskBlockManager: Shutdown hook called
 15/12/23 11:15:50 INFO ShutdownHookManager: Shutdown hook called
 
     Then I checked the Spark Master log and found the following critical
 messages:
 
 15/12/23 11:15:33 ERROR ErrorMonitor: dropping message [class
 akka.actor.ActorSelectionMessage] for non-local recipient
 [Actor[akka.tcp://sparkMaster@10.20.17.70:7077/]] arriving at
 [akka.tcp://sparkMaster@10.20.17.70:7077] inbound addresses are
 [akka.tcp://sparkMaster@hadoop00:7077]
 akka.event.Logging$Error$NoCause$
 15/12/23 11:15:53 INFO Master: 10.20.6.23:56374 got disassociated, removing
 it.
 15/12/23 11:15:53 INFO Master: 10.20.6.23:56374 got disassociated, removing
 it.
 15/12/23 11:15:53 WARN ReliableDeliverySupervisor: Association with remote
 system [akka.tcp://sparkDriver@10.20.6.23:56374] has failed, address is now
 gated for [5000] ms. Reason: [Disassociated]
 
     Here's my Scala code:
 
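 import org.apache.spark.{SparkConf, SparkContext}

 object WordCount {
   def main(args: Array[String]): Unit = {
     // Connect to the standalone master and ship the application jar to the executors.
     val conf = new SparkConf()
       .setAppName("Scala WordCount")
       .setMaster("spark://10.20.17.70:7077")
       .setJars(List("C:\\Temp\\test.jar"))
     val sc = new SparkContext(conf)
     val textFile = sc.textFile("hdfs://10.20.17.70:9000/wc/indata/wht.txt")
     // Split each line into words, count the occurrences, and print them on the driver.
     textFile.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).collect().foreach(println)
   }
 }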
 
     To solve the problem, I tried the following:
 
     (1) Ran spark-shell to check the Scala version; it is 2.10.4, which is
 compatible with the eclipse-scala plugin.
     (2) Ran spark-submit on the SparkPi example with the --master param set
 to "10.20.17.70:7077"; it successfully produced the result. I was also able
 to see the application history on the Master's Web UI.
     (3) Turned off the firewall on my Windows machine.
 
     Unfortunately, the error message remains. Could anybody give me some
 suggestions? Thanks very much!
 
 Yours Sincerely,
 Yefeng
 
 
 
 --
 View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Problem-of-submitting-Spark-task-to-cluster-from-eclipse-IDE-on-Windows-tp25778.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.
 
