Hi Hokam,
Thank you very much. Your approach works once I add the hostname/IP mapping to
the Windows hosts file. However, a new error now appears. I think it is a
common one, as I have seen the same message in many places.
Here is part of the output from the Eclipse console:
15/12/24 11:59:08 INFO SparkDeploySchedulerBackend: Granted executor ID
app-20151224105757-0002/92 on hostPort 10.20.17.74:44097 with 4 cores, 1024.0
MB RAM
15/12/24 11:59:08 INFO AppClient$ClientEndpoint: Executor updated:
app-20151224105757-0002/92 is now LOADING
15/12/24 11:59:08 INFO AppClient$ClientEndpoint: Executor updated:
app-20151224105757-0002/92 is now RUNNING
15/12/24 11:59:12 WARN TaskSchedulerImpl: Initial job has not accepted any
resources; check your cluster UI to ensure that workers are registered and have
sufficient resources
15/12/24 11:59:27 WARN TaskSchedulerImpl: Initial job has not accepted any
resources; check your cluster UI to ensure that workers are registered and have
sufficient resources
15/12/24 11:59:42 WARN TaskSchedulerImpl: Initial job has not accepted any
resources; check your cluster UI to ensure that workers are registered and have
sufficient resources
15/12/24 11:59:57 WARN TaskSchedulerImpl: Initial job has not accepted any
resources; check your cluster UI to ensure that workers are registered and have
sufficient resources
15/12/24 12:00:12 WARN TaskSchedulerImpl: Initial job has not accepted any
resources; check your cluster UI to ensure that workers are registered and have
sufficient resources
15/12/24 12:00:27 WARN TaskSchedulerImpl: Initial job has not accepted any
resources; check your cluster UI to ensure that workers are registered and have
sufficient resources
15/12/24 12:00:42 WARN TaskSchedulerImpl: Initial job has not accepted any
resources; check your cluster UI to ensure that workers are registered and have
sufficient resources
15/12/24 12:00:57 WARN TaskSchedulerImpl: Initial job has not accepted any
resources; check your cluster UI to ensure that workers are registered and have
sufficient resources
15/12/24 12:01:08 INFO AppClient$ClientEndpoint: Executor updated:
app-20151224105757-0002/90 is now EXITED (Command exited with code 1)
15/12/24 12:01:08 INFO SparkDeploySchedulerBackend: Executor
app-20151224105757-0002/90 removed: Command exited with code 1
15/12/24 12:01:08 INFO SparkDeploySchedulerBackend: Asked to remove
non-existent executor 90
15/12/24 12:01:08 INFO AppClient$ClientEndpoint: Executor added:
app-20151224105757-0002/93 on worker-20151221140040-10.20.17.76-33817
(10.20.17.76:33817) with 4 cores
15/12/24 12:01:08 INFO SparkDeploySchedulerBackend: Granted executor ID
app-20151224105757-0002/93 on hostPort 10.20.17.76:33817 with 4 cores, 1024.0
MB RAM
15/12/24 12:01:08 INFO AppClient$ClientEndpoint: Executor updated:
app-20151224105757-0002/93 is now LOADING
15/12/24 12:01:08 INFO AppClient$ClientEndpoint: Executor updated:
app-20151224105757-0002/93 is now RUNNING
15/12/24 12:01:09 INFO AppClient$ClientEndpoint: Executor updated:
app-20151224105757-0002/91 is now EXITED (Command exited with code 1)
15/12/24 12:01:09 INFO SparkDeploySchedulerBackend: Executor
app-20151224105757-0002/91 removed: Command exited with code 1
15/12/24 12:01:09 INFO SparkDeploySchedulerBackend: Asked to remove
non-existent executor 91
15/12/24 12:01:09 INFO AppClient$ClientEndpoint: Executor added:
app-20151224105757-0002/94 on worker-20151221140040-10.20.17.75-47807
(10.20.17.75:47807) with 4 cores
15/12/24 12:01:09 INFO SparkDeploySchedulerBackend: Granted executor ID
app-20151224105757-0002/94 on hostPort 10.20.17.75:47807 with 4 cores, 1024.0
MB RAM
15/12/24 12:01:09 INFO AppClient$ClientEndpoint: Executor updated:
app-20151224105757-0002/94 is now LOADING
15/12/24 12:01:09 INFO AppClient$ClientEndpoint: Executor updated:
app-20151224105757-0002/94 is now RUNNING
15/12/24 12:01:10 INFO AppClient$ClientEndpoint: Executor updated:
app-20151224105757-0002/92 is now EXITED (Command exited with code 1)
15/12/24 12:01:10 INFO SparkDeploySchedulerBackend: Executor
app-20151224105757-0002/92 removed: Command exited with code 1
15/12/24 12:01:10 INFO SparkDeploySchedulerBackend: Asked to remove
non-existent executor 92
15/12/24 12:01:10 INFO AppClient$ClientEndpoint: Executor added:
app-20151224105757-0002/95 on worker-20151221193318-10.20.17.74-44097
(10.20.17.74:44097) with 4 cores
15/12/24 12:01:10 INFO SparkDeploySchedulerBackend: Granted executor ID
app-20151224105757-0002/95 on hostPort 10.20.17.74:44097 with 4 cores, 1024.0
MB RAM
15/12/24 12:01:10 INFO AppClient$ClientEndpoint: Executor updated:
app-20151224105757-0002/95 is now LOADING
15/12/24 12:01:10 INFO AppClient$ClientEndpoint: Executor updated:
app-20151224105757-0002/95 is now RUNNING
15/12/24 12:01:12 WARN TaskSchedulerImpl: Initial job has not accepted any
resources; check your cluster UI to ensure that workers are registered and have
sufficient resources
15/12/24 12:01:27 WARN TaskSchedulerImpl: Initial job has not accepted any
resources; check your cluster UI to ensure that workers are registered and have
sufficient resources
15/12/24 12:01:42 WARN TaskSchedulerImpl: Initial job has not accepted any
resources; check your cluster UI to ensure that workers are registered and have
sufficient resources
...
The logs in Spark's master UI show that each Worker hits the following exception:
15/12/24 17:17:29 INFO Utils: Successfully started service 'driverPropsFetcher' on port 50576.
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1643)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:68)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:149)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:250)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.lookupTimeout
    at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcEnv.scala:214)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:229)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:225)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:242)
    at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:98)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:162)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:69)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:68)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    ... 4 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:107)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcEnv.scala:241)
    ... 11 more
I think it is still a network connection problem, but we are not far from victory.
I will continue with Akhil's approach of setting up NAT and port forwarding (I have
not yet been able to contact the person in charge of this). If you have better
suggestions, please let me know as soon as possible.
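The worker-side timeout above occurs while the executor looks up the driver's endpoint (CoarseGrainedExecutorBackend.run calling RpcEnv.setupEndpointRefByURI in the stack trace), which suggests the workers cannot open a connection back to the driver on the Windows machine. The following is a minimal sketch of pinning the driver's advertised address and port, not a confirmed fix. It assumes the driver machine is 10.20.6.23 and the master is reachable as hadoop00, both taken from this thread; the port number 51000 and the object name are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object DriverAddressExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("Scala WordCount")
      .setMaster("spark://hadoop00:7077")
      // Address the executors should use to reach the driver (the Windows machine).
      .set("spark.driver.host", "10.20.6.23")
      // Hypothetical fixed port, so a firewall or port-forwarding rule can target it.
      .set("spark.driver.port", "51000")
      .setJars(List("C:\\Temp\\test.jar"))
    val sc = new SparkContext(conf)
    try {
      // Trivial job: it only completes if executors can connect back to the driver.
      println(sc.parallelize(1 to 100).count())
    } finally {
      sc.stop()
    }
  }
}
```

Fixing spark.driver.port is a design choice for this situation: by default the driver picks a random port, which is hard to match with a port-forwarding rule.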
Best Regards,
Yefeng
------------------ Original Message ------------------
From: "Hokam Singh Chauhan" <[email protected]>
Sent: 2015-12-24, 10:54
To: "Akhil Das" <[email protected]>
Cc: "user" <[email protected]>; <[email protected]>
Subject: Re: Problem of submitting Spark task to cluster from eclipse IDE on Windows
Hi,
If you are currently using the IP address in the Spark master URL, use
spark://hostname:7077 instead.
I faced the same issue, and it was resolved by using the hostname in the Spark
master URL rather than the IP address.
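A minimal sketch of this change, applied to the WordCount driver from the original post. It assumes the master's hostname is hadoop00 (the name shown in the cluster table and in the master log's inbound address) and that the Windows hosts file maps it to 10.20.17.70. The object name is illustrative.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: assumes the Windows hosts file
// (C:\Windows\System32\drivers\etc\hosts) contains a line such as
//   10.20.17.70  hadoop00
// so that the driver can resolve the master's hostname.
object HostnameMasterExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("Scala WordCount")
      // Use the hostname the master itself listens on, not its IP address.
      .setMaster("spark://hadoop00:7077")
      .setJars(List("C:\\Temp\\test.jar"))
    val sc = new SparkContext(conf)
    try {
      println(s"Connected to master: ${sc.master}")
    } finally {
      sc.stop()
    }
  }
}
```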
Regards,
Hokam
On 23 Dec 2015 13:41, "Akhil Das" <[email protected]> wrote:
You need to:
1. Make sure your local router has NAT enabled and forwards the networking
ports listed here.
2. Make sure port 7077 on your cluster is accessible from your local (public) IP
address. You can try: telnet 10.20.17.70 7077
3. Set spark.driver.host so that the cluster can connect back to your machine.
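If telnet is not available on the Windows machine, the check in step 2 can be approximated with a plain TCP connection attempt. This is a minimal sketch using only the JDK; the object and method names are illustrative, and the host and port are the ones mentioned in step 2.

```scala
import java.net.{InetSocketAddress, Socket}
import scala.util.Try

object PortCheck {
  /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
  def canConnect(host: String, port: Int, timeoutMs: Int = 3000): Boolean = {
    val socket = new Socket()
    try {
      Try(socket.connect(new InetSocketAddress(host, port), timeoutMs)).isSuccess
    } finally {
      socket.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // Master address and port taken from the thread; adjust for your cluster.
    val reachable = canConnect("10.20.17.70", 7077)
    println(s"10.20.17.70:7077 reachable: $reachable")
  }
}
```

This only tests the direction from the local machine to the master. Checking that the cluster can connect back to the driver (step 3) would need the same kind of test run from a cluster node against the driver's address and port.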
Thanks
Best Regards
On Wed, Dec 23, 2015 at 10:02 AM, superbee84 <[email protected]> wrote:
Hi All,
I'm new to Spark. Before I describe the problem, I'd like to explain the roles
of the machines that make up the cluster and the purpose of my work. By
reading and following the instructions and tutorials, I successfully built a
cluster of 7 CentOS 6.5 machines. I installed Hadoop 2.7.1, Spark 1.5.1,
Scala 2.10.4 and ZooKeeper 3.4.5 on them. The details are listed below:
Host Name | IP Address  | Hadoop 2.7.1             | Spark 1.5.1     | ZooKeeper
hadoop00  | 10.20.17.70 | NameNode(Active)         | Master(Active)  | none
hadoop01  | 10.20.17.71 | NameNode(Standby)        | Master(Standby) | none
hadoop02  | 10.20.17.72 | ResourceManager(Active)  | none            | none
hadoop03  | 10.20.17.73 | ResourceManager(Standby) | none            | none
hadoop04  | 10.20.17.74 | DataNode                 | Worker          | JournalNode
hadoop05  | 10.20.17.75 | DataNode                 | Worker          | JournalNode
hadoop06  | 10.20.17.76 | DataNode                 | Worker          | JournalNode
Now my *purpose* is to develop Hadoop/Spark applications on my own
computer (IP: 10.20.6.23) and submit them to the remote cluster. As everyone
else in our group is used to working with Eclipse on Windows, I'm trying
to make this setup work. I have successfully submitted the WordCount MapReduce
job to YARN from Eclipse on Windows, and it ran smoothly. But when I tried to
run the Spark WordCount, it gave me the following error in the Eclipse
console:
15/12/23 11:15:30 INFO AppClient$ClientEndpoint: Connecting to master spark://10.20.17.70:7077...
15/12/23 11:15:50 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[appclient-registration-retry-thread,5,main]
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@29ed85e7 rejected from java.util.concurrent.ThreadPoolExecutor@28f21632[Running, pool size = 1, active threads = 0, queued tasks = 0, completed tasks = 1]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
    at java.util.concurrent.AbstractExecutorService.submit(Unknown Source)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:96)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:95)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint.tryRegisterAllMasters(AppClient.scala:95)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint.org$apache$spark$deploy$client$AppClient$ClientEndpoint$$registerWithMaster(AppClient.scala:121)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2$$anonfun$run$1.apply$mcV$sp(AppClient.scala:132)
    at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1119)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2.run(AppClient.scala:124)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
15/12/23 11:15:50 INFO DiskBlockManager: Shutdown hook called
15/12/23 11:15:50 INFO ShutdownHookManager: Shutdown hook called
Then I checked the Spark Master log and found the following critical
statements:
15/12/23 11:15:33 ERROR ErrorMonitor: dropping message [class
akka.actor.ActorSelectionMessage] for non-local recipient
[Actor[akka.tcp://[email protected]:7077/]] arriving at
[akka.tcp://[email protected]:7077] inbound addresses are
[akka.tcp://sparkMaster@hadoop00:7077]
akka.event.Logging$Error$NoCause$
15/12/23 11:15:53 INFO Master: 10.20.6.23:56374 got disassociated, removing
it.
15/12/23 11:15:53 INFO Master: 10.20.6.23:56374 got disassociated, removing
it.
15/12/23 11:15:53 WARN ReliableDeliverySupervisor: Association with remote
system [akka.tcp://[email protected]:56374] has failed, address is now
gated for [5000] ms. Reason: [Disassociated]
Here's my Scala code:
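import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    // Configure the application, the standalone master, and the jar shipped to the executors.
    val conf = new SparkConf()
      .setAppName("Scala WordCount")
      .setMaster("spark://10.20.17.70:7077")
      .setJars(List("C:\\Temp\\test.jar"))
    val sc = new SparkContext(conf)
    // Read the input from HDFS, split it on spaces, and count each word.
    val textFile = sc.textFile("hdfs://10.20.17.70:9000/wc/indata/wht.txt")
    textFile.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).collect().foreach(println)
  }
}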
To solve the problem, I tried the following:
(1) Ran spark-shell to check the Scala version; it is 2.10.4, which is
compatible with the Eclipse Scala plugin.
(2) Ran spark-submit on the SparkPi example with the --master parameter set
to "10.20.17.70:7077"; it produced the result successfully. I was also able
to see the application history on the Master's Web UI.
(3) Turned off the firewall on my Windows machine.
Unfortunately, the error message remains. Could anybody give me some
suggestions? Thanks very much!
Yours Sincerely,
Yefeng
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Problem-of-submitting-Spark-task-to-cluster-from-eclipse-IDE-on-Windows-tp25778.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.