Hi guys, I'm planning to use Spark on a project and I've hit a problem: I couldn't find a log that explains what's wrong with what I'm doing.
I have 2 VMs that run a small Hadoop (2.6.0) cluster. I added a file with 50 lines of JSON data to HDFS, compiled Spark (all tests passed), ran some simple scripts, and then ran the YARN example from the documentation page:

./bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --num-executors 3 \
  --driver-memory 512m \
  --executor-memory 512m \
  --executor-cores 1 \
  lib/spark-examples*.jar \
  10

They all worked, so it was time to start playing with the YARN integration. I ran:

./bin/spark-shell -Dspark-cores-max=1 -Dspark.executor.memory=128m --driver-memory 385m --executor-memory 385m --master yarn-client

and then, in the shell:

*scala> val file = sc.textFile("hdfs:///logs/events.log")*
14/12/11 09:50:35 INFO storage.MemoryStore: ensureFreeSpace(82156) called with curMem=0, maxMem=278302556
14/12/11 09:50:35 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 80.2 KB, free 265.3 MB)
14/12/11 09:50:35 INFO storage.MemoryStore: ensureFreeSpace(17472) called with curMem=82156, maxMem=278302556
14/12/11 09:50:35 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 17.1 KB, free 265.3 MB)
14/12/11 09:50:35 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.17.42.1:53158 (size: 17.1 KB, free: 265.4 MB)
14/12/11 09:50:35 INFO storage.BlockManagerMaster: Updated info of block broadcast_0_piece0
file: org.apache.spark.rdd.RDD[String] = hdfs:///logs/events.log MappedRDD[1] at textFile at <console>:12

*scala> val newsletters = file.filter(line => line.contains("newsletter_id"))*
newsletters: org.apache.spark.rdd.RDD[String] = FilteredRDD[2] at filter at <console>:14

*scala> newsletters.count()*
14/12/11 09:50:47 INFO mapred.FileInputFormat: Total input paths to process : 1
14/12/11 09:50:47 INFO spark.SparkContext: Starting job: count at <console>:17
14/12/11 09:50:47 INFO scheduler.DAGScheduler: Got job 0 (count at <console>:17) with 2 output partitions (allowLocal=false)
14/12/11 09:50:47 INFO scheduler.DAGScheduler: Final stage: Stage 0(count at <console>:17)
14/12/11 09:50:47 INFO scheduler.DAGScheduler: Parents of final stage: List()
14/12/11 09:50:47 INFO scheduler.DAGScheduler: Missing parents: List()
14/12/11 09:50:47 INFO scheduler.DAGScheduler: Submitting Stage 0 (FilteredRDD[2] at filter at <console>:14), which has no missing parents
14/12/11 09:50:47 INFO storage.MemoryStore: ensureFreeSpace(2688) called with curMem=99628, maxMem=278302556
14/12/11 09:50:47 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 2.6 KB, free 265.3 MB)
14/12/11 09:50:47 INFO storage.MemoryStore: ensureFreeSpace(1687) called with curMem=102316, maxMem=278302556
14/12/11 09:50:47 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1687.0 B, free 265.3 MB)
14/12/11 09:50:47 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on 172.17.42.1:53158 (size: 1687.0 B, free: 265.4 MB)
14/12/11 09:50:47 INFO storage.BlockManagerMaster: Updated info of block broadcast_1_piece0
14/12/11 09:50:47 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from Stage 0 (FilteredRDD[2] at filter at <console>:14)
14/12/11 09:50:47 INFO cluster.YarnClientClusterScheduler: Adding task set 0.0 with 2 tasks
14/12/11 09:50:47 INFO util.RackResolver: Resolved hdfs1 to /default-rack
14/12/11 09:51:02 WARN cluster.YarnClientClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/12/11 09:51:17 WARN cluster.YarnClientClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
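(As an aside, I'm not sure those -D flags are actually picked up by spark-shell, and spark-cores-max looks like a typo on my part for spark.cores.max, which I'm not even sure applies under YARN. If I understand the docs correctly, such settings would normally be passed with --conf, something like:

./bin/spark-shell --master yarn-client --driver-memory 385m --executor-memory 385m --conf spark.cores.max=1

and I suppose the effective value could be checked from the shell with sc.getConf.getOption("spark.executor.memory"). I haven't confirmed whether any of this changes the behaviour, so it may be unrelated to the problem.)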
On the node running the shell, the Spark console shows the stage active but with no tasks completing:

Active Stages (1)
Stage Id: 0
Description: count at <console>:17
Submitted: 2014/12/11 10:18:25
Duration: 5.8 min
Tasks (Succeeded/Total): 0/2

On the Hadoop node executing the task, the NodeManager information page shows:

Total Vmem allocated for Containers: 3.10 GB
Vmem enforcement enabled: true
Total Pmem allocated for Containers: 1.48 GB
Pmem enforcement enabled: true
Total VCores allocated for Containers: 1
NodeHealthyStatus: true
LastNodeHealthTime: Thu Dec 11 10:17:49 ART 2014
NodeHealthReport:
Node Manager Version: 2.6.0 from e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1 by jenkins source checksum 7e1415f8c555842b6118a192d86f5e8 on 2014-11-13T21:17Z
Hadoop Version: 2.6.0 from e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1 by jenkins source checksum 18e43357c8f927c0695f1e9522859d6a on 2014-11-13T21:10Z

And the ResourceManager cluster metrics (hdfs0:8088):

Apps Submitted: 1
Apps Pending: 0
Apps Running: 1
Apps Completed: 0
Containers Running: 1
Memory Used: 1 GB
Memory Total: 2.95 GB
Memory Reserved: 0 B
VCores Used: 1
VCores Total: 2
VCores Reserved: 0
Active Nodes: 2
Decommissioned / Lost / Unhealthy / Rebooted Nodes: 0 / 0 / 0 / 0

The running application:

ID: application_1418303626573_0001
User: fotero
Name: Spark shell
Application Type: SPARK
Queue: default
StartTime: Thu, 11 Dec 2014 13:18:02 GMT
FinishTime: N/A
State: RUNNING
FinalStatus: UNDEFINED
Tracking UI: ApplicationMaster

I then tried the same thing as a small standalone app and got the same results (but with the "Initial job has not accepted any resources" warning showing up in the Hadoop logs):

import org.apache.spark.{SparkConf, SparkContext}

/** This is a simple test */
object TestSpark {
  def main(args: Array[String]) {
    System.err.println("Let's start")
    val sparkConf = new SparkConf().setAppName("TestSpark")
    val sc = new SparkContext(sparkConf)
    val file = sc.textFile("hdfs:///logs/events.log")
    val newsletters = file.filter(line => line.contains("newsletter_id"))
    val count = newsletters.count()
    System.err.println("count: " + count)
  }
}
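I built a jar with that class and submitted it in yarn-cluster mode along these lines (the jar name below is just a placeholder, and the memory flags are approximate):

./bin/spark-submit --class TestSpark \
  --master yarn-cluster \
  --driver-memory 385m \
  --executor-memory 385m \
  --executor-cores 1 \
  test-spark.jar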
The container stderr on the Hadoop node shows the same warnings:

*hadoop-2.6.0/logs/userlogs/application_1418303626573_0007/container_1418303626573_0007_01_000002* > cat stderr

...
14/12/11 12:32:09 INFO storage.MemoryStore: ensureFreeSpace(2664) called with curMem=236790, maxMem=280248975
14/12/11 12:32:09 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 2.6 KB, free 267.0 MB)
14/12/11 12:32:09 INFO storage.MemoryStore: ensureFreeSpace(1682) called with curMem=239454, maxMem=280248975
14/12/11 12:32:09 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1682.0 B, free 267.0 MB)
14/12/11 12:32:09 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on hdfs0:51045 (size: 1682.0 B, free: 267.2 MB)
14/12/11 12:32:09 INFO storage.BlockManagerMaster: Updated info of block broadcast_1_piece0
14/12/11 12:32:09 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from Stage 0 (FilteredRDD[2] at filter at TestSpark.scala:10)
14/12/11 12:32:09 INFO cluster.YarnClusterScheduler: Adding task set 0.0 with 2 tasks
14/12/11 12:32:09 INFO util.RackResolver: Resolved hdfs1 to /default-rack
14/12/11 12:32:24 WARN cluster.YarnClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/12/11 12:32:39 WARN cluster.YarnClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/12/11 12:32:54 WARN cluster.YarnClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

Can anyone give me a tip about what I'm missing? :D Thanks!