I've spent the last three days trying to get Spark on a single box to
connect to YARN while working through the examples, and I'm at a loss.

It's a dual-core box running Debian Jessie. I've tried both Java 7 and
Java 8 from Oracle. It has Hadoop 2.7 installed with YARN running, Scala
2.10.4, and Spark 1.4.0.

Running `spark-shell --master local[*]` works fine.
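
For YARN mode I have HADOOP_CONF_DIR exported so spark-shell can find the
ResourceManager; the launch looks roughly like this (I'm guessing the conf
path here from the Hadoop install location that shows up in the logs below):

    export HADOOP_CONF_DIR=/usr/local/hadoop-2.7.0/etc/hadoop
    ~/spark-1.4.0/bin/spark-shell --master yarn-client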

However, that yarn-client run gives the following warnings before it
fails:

----------------------------------------

15/06/21 11:03:16 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint:
ApplicationMaster has disassociated: 10.0.1.201:59881

15/06/21 11:03:16 WARN remote.ReliableDeliverySupervisor: Association with
remote system [akka.tcp://sparkYarnAM@10.0.1.201:59881] has failed, address
is now gated for [5000] ms. Reason is: [Disassociated].

15/06/21 11:03:16 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint:
ApplicationMaster has disassociated: 10.0.1.201:59881

15/06/21 11:03:19 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint:
ApplicationMaster registered as AkkaRpcEndpointRef(Actor[akka.tcp://
sparkYarnAM@10.0.1.201:53719/user/YarnAM#-688478431])

15/06/21 11:03:19 INFO cluster.YarnClientSchedulerBackend: Add WebUI
Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter,
Map(PROXY_HOSTS -> KoolKatz, PROXY_URI_BASES ->
http://KoolKatz:8088/proxy/application_1434751301309_0015),
/proxy/application_1434751301309_0015

15/06/21 11:03:19 INFO ui.JettyUtils: Adding filter:
org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter

15/06/21 11:03:22 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint:
ApplicationMaster has disassociated: 10.0.1.201:53719

15/06/21 11:03:22 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint:
ApplicationMaster has disassociated: 10.0.1.201:53719

15/06/21 11:03:22 WARN remote.ReliableDeliverySupervisor: Association with
remote system [akka.tcp://sparkYarnAM@10.0.1.201:53719] has failed, address
is now gated for [5000] ms. Reason is: [Disassociated].

15/06/21 11:03:22 ERROR cluster.YarnClientSchedulerBackend: Yarn
application has already exited with state FINISHED!
--------------------------------------

The Spark side doesn't say why the AM exited; looking at the Hadoop
NodeManager log I see this:

--------------------------------------

2015-06-21 11:03:22,028 WARN
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
Process tree for container: container_1434751301309_0015_02_000001 has
processes older than 1 iteration running over the configured limit.
Limit=2254857728, current usage = 2296930304
2015-06-21 11:03:22,029 WARN
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
Container [pid=39288,containerID=container_1434751301309_0015_02_000001]
is running beyond virtual memory limits. Current usage: 240.2 MB of 1
GB physical memory used; 2.1 GB of 2.1 GB virtual memory used. Killing
container.
Dump of the process-tree for container_1434751301309_0015_02_000001 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES)
FULL_CMD_LINE
        |- 39288 39286 39288 39288 (bash) 0 0 13590528 709 /bin/bash -c
/usr/lib/jvm/java-8-oracle/bin/java -server -Xmx512m
-Djava.io.tmpdir=/var/hadoop/tmp/nm-local-dir/usercache/spgarbet/appcache/application_1434751301309_0015/container_1434751301309_0015_02_000001/tmp
'-Dspark.driver.appUIAddress=http://10.0.1.201:4040'
'-Dspark.app.name=Spark shell'
'-Dspark.yarn.executor.memoryOverhead=2048'
'-Dspark.master=yarn-client' '-Dspark.driver.port=46745'
'-Dspark.repl.class.uri=http://10.0.1.201:49759'
'-Dspark.driver.host=10.0.1.201' '-Dspark.executor.id=driver'
'-Dspark.externalBlockStore.folderName=spark-faf3dbe8-612d-47b0-89ed-0aa9993a6c32'
'-Dspark.jars=' '-Dspark.home=/home/spgarbet/spark-1.4.0'
'-Dspark.fileserver.uri=http://10.0.1.201:33063'
-Dspark.yarn.app.container.log.dir=/usr/local/hadoop-2.7.0/logs/userlogs/application_1434751301309_0015/container_1434751301309_0015_02_000001
org.apache.spark.deploy.yarn.ExecutorLauncher --arg '10.0.1.201:46745'
--executor-memory 1024m --executor-cores 1 --num-executors  2 1>
/usr/local/hadoop-2.7.0/logs/userlogs/application_1434751301309_0015/container_1434751301309_0015_02_000001/stdout
2> 
/usr/local/hadoop-2.7.0/logs/userlogs/application_1434751301309_0015/container_1434751301309_0015_02_000001/stderr
        |- 39292 39288 39288 39288 (java) 729 24 2283339776 60789
/usr/lib/jvm/java-8-oracle/bin/java -server -Xmx512m
-Djava.io.tmpdir=/var/hadoop/tmp/nm-local-dir/usercache/spgarbet/appcache/application_1434751301309_0015/container_1434751301309_0015_02_000001/tmp
-Dspark.driver.appUIAddress=http://10.0.1.201:4040
-Dspark.app.name=Spark shell -Dspark.yarn.executor.memoryOverhead=2048
-Dspark.master=yarn-client -Dspark.driver.port=46745
-Dspark.repl.class.uri=http://10.0.1.201:49759
-Dspark.driver.host=10.0.1.201 -Dspark.executor.id=driver
-Dspark.externalBlockStore.folderName=spark-faf3dbe8-612d-47b0-89ed-0aa9993a6c32
-Dspark.jars= -Dspark.home=/home/spgarbet/spark-1.4.0
-Dspark.fileserver.uri=http://10.0.1.201:33063
-Dspark.yarn.app.container.log.dir=/usr/local/hadoop-2.7.0/logs/userlogs/application_1434751301309_0015/container_1434751301309_0015_02_000001
org.apache.spark.deploy.yarn.ExecutorLauncher --arg 10.0.1.201:46745
--executor-memory 1024m --executor-cores 1 --num-executors 2

------------------------------------
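
If I'm reading that right, YARN's virtual-memory check is killing the
ApplicationMaster container: the java process shows 2.1 GB of virtual
memory against a 2.1 GB limit, i.e. the 1 GB container times the default
yarn.nodemanager.vmem-pmem-ratio of 2.1. Is the usual workaround to relax
that check in yarn-site.xml on the NodeManager, something like this?
(Property names are from the Hadoop docs; the ratio value is just a guess.)

    <property>
      <name>yarn.nodemanager.vmem-check-enabled</name>
      <value>false</value>
    </property>

    <!-- or, instead of disabling the check, raise the ratio -->
    <property>
      <name>yarn.nodemanager.vmem-pmem-ratio</name>
      <value>4</value>
    </property>

Or should I be requesting a bigger AM container instead (e.g. via
spark.yarn.am.memory), so the virtual-memory ceiling scales up with it?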

Any ideas?

-- 
Shawn Garbett
