Greetings!
Thanks to all who have taken the time to look at this.
While the process is stalled, I see, in the yarn log on the head node,
repeating messages of the form "Trying to fulfill reservation for application
XXX on node YYY", but that node is is reserved by XXX_000001. Below is a chunk
of the log.
Any suggestions as to what I can investigate?
Thanks!-Mike
2015-02-04 18:18:28,617 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
(ResourceManager Event Processor): Trying to fulfill reservation for
application application_1422834185427_0088 on node:
ip-10-171-0-124.ec2.internal:91032015-02-04 18:18:28,617 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue
(ResourceManager Event Processor): assignContainers:
node=ip-10-171-0-124.ec2.internal application=88 priority=1 request={Priority:
1, Capability: <memory:8704, vCores:1>, # Containers: 17, Labels: , Location:
*, Relax Locality: true} type=OFF_SWITCH2015-02-04 18:18:28,617 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt
(ResourceManager Event Processor): Application application_1422834185427_0088
reserved container container_1422834185427_0088_01_000025 on node host:
ip-10-171-0-124.ec2.internal:9103 #containers=2 available=5632 used=17408,
currently has 6 at priority 1; currentReservation 522242015-02-04 18:18:28,618
INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue
(ResourceManager Event Processor): Reserved container
application=application_1422834185427_0088 resource=<memory:8704, vCores:1>
queue=default: capacity=1.0, absoluteCapacity=1.0,
usedResources=<memory:226304, vCores:26>usedCapacity=0.9822222,
absoluteUsedCapacity=0.9822222, numApps=1, numContainers=26
usedCapacity=0.9822222 absoluteUsedCapacity=0.9822222 used=<memory:226304,
vCores:26> cluster=<memory:230400, vCores:160>2015-02-04 18:18:28,618 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
(ResourceManager Event Processor): Skipping scheduling since node
ip-10-171-0-124.ec2.internal:9103 is reserved by application
appattempt_1422834185427_0088_0000012015-02-04 18:18:28,623 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue
(ResourceManager Event Processor): default usedResources: <memory:226304,
vCores:26> clusterResources: <memory:230400, vCores:160> currentCapacity
0.9822222 required <memory:8704, vCores:1> potentialNewCapacity: 1.02 (
max-capacity: 1.0)2015-02-04 18:18:28,625 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
(ResourceManager Event Processor): Trying to fulfill reservation for
application application_1422834185427_0088 on node:
ip-10-171-0-119.ec2.internal:91032015-02-04 18:18:28,625 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue
(ResourceManager Event Processor): assignContainers:
node=ip-10-171-0-119.ec2.internal application=88 priority=1 request={Priority:
1, Capability: <memory:8704, vCores:1>, # Containers: 17, Labels: , Location:
*, Relax Locality: true} type=OFF_SWITCH2015-02-04 18:18:28,626 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt
(ResourceManager Event Processor): Application application_1422834185427_0088
reserved container container_1422834185427_0088_01_000026 on node host:
ip-10-171-0-119.ec2.internal:9103 #containers=2 available=5632 used=17408,
currently has 6 at priority 1; currentReservation 522242015-02-04 18:18:28,626
INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue
(ResourceManager Event Processor): Reserved container
application=application_1422834185427_0088 resource=<memory:8704, vCores:1>
queue=default: capacity=1.0, absoluteCapacity=1.0,
usedResources=<memory:226304, vCores:26>usedCapacity=0.9822222,
absoluteUsedCapacity=0.9822222, numApps=1, numContainers=26
usedCapacity=0.9822222 absoluteUsedCapacity=0.9822222 used=<memory:226304,
vCores:26> cluster=<memory:230400, vCores:160>2015-02-04 18:18:28,626 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
(ResourceManager Event Processor): Skipping scheduling since node
ip-10-171-0-119.ec2.internal:9103 is reserved by application
appattempt_1422834185427_0088_0000012015-02-04 18:18:28,636 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
(ResourceManager Event Processor): Trying to fulfill reservation for
application application_1422834185427_0088 on node:
ip-10-171-0-129.ec2.internal:91032015-02-04 18:18:28,636 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue
(ResourceManager Event Processor): assignContainers:
node=ip-10-171-0-129.ec2.internal application=88 priority=1 request={Priority:
1, Capability: <memory:8704, vCores:1>, # Containers: 17, Labels: , Location:
*, Relax Locality: true} type=OFF_SWITCH2015-02-04 18:18:28,636 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt
(ResourceManager Event Processor): Application application_1422834185427_0088
reserved container container_1422834185427_0088_01_000021 on node host:
ip-10-171-0-129.ec2.internal:9103 #containers=2 available=5632 used=17408,
currently has 6 at priority 1; currentReservation 522242015-02-04 18:18:28,636
INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue
(ResourceManager Event Processor): Reserved container
application=application_1422834185427_0088 resource=<memory:8704, vCores:1>
queue=default: capacity=1.0, absoluteCapacity=1.0,
usedResources=<memory:226304, vCores:26>usedCapacity=0.9822222,
absoluteUsedCapacity=0.9822222, numApps=1, numContainers=26
usedCapacity=0.9822222 absoluteUsedCapacity=0.9822222 used=<memory:226304,
vCores:26> cluster=<memory:230400, vCores:160>2015-02-04 18:18:28,636 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
(ResourceManager Event Processor): Skipping scheduling since node
ip-10-171-0-129.ec2.internal:9103 is reserved by application
appattempt_1422834185427_0088_0000012015-02-04 18:18:28,645 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue
(ResourceManager Event Processor): default usedResources: <memory:226304,
vCores:26> clusterResources: <memory:230400, vCores:160> currentCapacity
0.9822222 required <memory:8704, vCores:1> potentialNewCapacity: 1.02 (
max-capacity: 1.0)2015-02-04 18:18:28,646 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
(ResourceManager Event Processor): Trying to fulfill reservation for
application application_1422834185427_0088 on node:
ip-10-171-0-122.ec2.internal:91032015-02-04 18:18:28,646 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue
(ResourceManager Event Processor): assignContainers:
node=ip-10-171-0-122.ec2.internal application=88 priority=1 request={Priority:
1, Capability: <memory:8704, vCores:1>, # Containers: 17, Labels: , Location:
*, Relax Locality: true} type=OFF_SWITCH2015-02-04 18:18:28,646 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt
(ResourceManager Event Processor): Application application_1422834185427_0088
reserved container container_1422834185427_0088_01_000023 on node host:
ip-10-171-0-122.ec2.internal:9103 #containers=2 available=5632 used=17408,
currently has 6 at priority 1; currentReservation 522242015-02-04 18:18:28,646
INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue
(ResourceManager Event Processor): Reserved container
application=application_1422834185427_0088 resource=<memory:8704, vCores:1>
queue=default: capacity=1.0, absoluteCapacity=1.0,
usedResources=<memory:226304, vCores:26>usedCapacity=0.9822222,
absoluteUsedCapacity=0.9822222, numApps=1, numContainers=26
usedCapacity=0.9822222 absoluteUsedCapacity=0.9822222 used=<memory:226304,
vCores:26> cluster=<memory:230400, vCores:160>2015-02-04 18:18:28,646 INFO
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
(ResourceManager Event Processor): Skipping scheduling since node
ip-10-171-0-122.ec2.internal:9103 is reserved by application
appattempt_1422834185427_0088_000001
From: Sandy Ryza <[email protected]>
To: Imran Rashid <[email protected]>
Cc: Michael Albert <[email protected]>; "[email protected]"
<[email protected]>
Sent: Wednesday, February 4, 2015 12:54 PM
Subject: Re: advice on diagnosing Spark stall for 1.5hr out of 3.5hr job?
Also, do you see any lines in the YARN NodeManager logs where it says that it's
killing a container?
-Sandy
On Wed, Feb 4, 2015 at 8:56 AM, Imran Rashid <[email protected]> wrote:
Hi Michael,
judging from the logs, it seems that those tasks are just working a really long
time. If you have long running tasks, then you wouldn't expect the driver to
output anything while those tasks are working.
What is unusual is that there is no activity during all that time the tasks are
executing. Are you sure you are looking at the activity of the executors (the
nodes that are actually running the tasks), and not the activity of the driver
node (the node where your "main" program lives, but that doesn't do any of the
distributed computation)? It would be perfectly normal for the driver node to
be idle while all the executors were busy with long running tasks.
I would look at:(a) the cpu usage etc. of the executor nodes during those long
running tasks(b) the thread dumps of the executors during those long running
tasks (available via the UI under the "Executors" tab, or just log into the
boxes and run jstack). Ideally this will point out a hotspot in your code that
is making these tasks take so long. (Or perhaps it'll point out what is going
on in spark internals that is so slow)(c) the summary metrics for the long
running stage, when it finally finishes (also available in the UI, under the
"Stages" tab). You will get a breakdown of how much time is spent in various
phases of the tasks, how much data is read, etc., which can help you figure out
why tasks are slow
Hopefully this will help you find out what is taking so long. If you find out
the executors really arent' doing anything during these really long tasks, it
would be great to find that out, and maybe get some more info for a bug report.
Imran
On Tue, Feb 3, 2015 at 6:18 PM, Michael Albert <[email protected]>
wrote:
Greetings!
First, my sincere thanks to all who have given me advice.Following previous
discussion, I've rearranged my code to try to keep the partitions to more
manageable sizes.Thanks to all who commented.
At the moment, the input set I'm trying to work with is about 90GB (avro
parquet format).
When I run on a reasonable chunk of the data (say half) things work reasonably.
On the full data, the spark process stalls.That is, for about 1.5 hours out of
a 3.5 hour run, I see no activity.No cpu usage, no error message, no network
activity.It just seems to sits there.The messages bracketing the stall are
shown below.
Any advice on how to diagnose this? I don't get any error messages. The spark
UI says that it is running a stage, but it makes no discernible
progress.Ganglia shows no CPU usage or network activity.When I shell into the
worker nodes there are no filled disks or other obvious problems.
How can I discern what Spark is waiting for?
The only weird thing seen, other than the stall, is that the yarn logs on the
workers have lines with messages like this:2015-02-03 22:59:58,890 INFO
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl
(Container Monitor): Memory usage of ProcessTree 13158 for container-id
container_1422834185427_0083_01_000021: 7.1 GB of 8.5 GB physical memory used;
7.6 GB of 42.5 GB virtual memory used
It's rather strange that it mentions 42.5 GB of virtual memory. The machines
are EMR machines with 32 GB of physical memory and, as far as I can determine,
no swap space.
The messages bracketing the stall are shown below.
Any advice is welcome.
Thanks!
Sincerely, Mike Albert
Before the stall.15/02/03 21:45:28 INFO cluster.YarnClientClusterScheduler:
Removed TaskSet 5.0, whose tasks have all completed, from pool 15/02/03
21:45:28 INFO scheduler.DAGScheduler: Stage 5 (mapPartitionsWithIndex at
Transposer.scala:147) finished in 4880.317 s15/02/03 21:45:28 INFO
scheduler.DAGScheduler: looking for newly runnable stages15/02/03 21:45:28 INFO
scheduler.DAGScheduler: running: Set(Stage 3)15/02/03 21:45:28 INFO
scheduler.DAGScheduler: waiting: Set(Stage 6, Stage 7, Stage 8)15/02/03
21:45:28 INFO scheduler.DAGScheduler: failed: Set()15/02/03 21:45:28 INFO
scheduler.DAGScheduler: Missing parents for Stage 6: List(Stage 3)15/02/03
21:45:28 INFO scheduler.DAGScheduler: Missing parents for Stage 7: List(Stage
6)15/02/03 21:45:28 INFO scheduler.DAGScheduler: Missing parents for Stage 8:
List(Stage 7)At this point, I see no activity for 1.5 hours except for this
(XXX for I.P. address)15/02/03 22:13:24 INFO util.AkkaUtils: Connecting to
ExecutorActor:
akka.tcp://[email protected]:36301/user/ExecutorActor
Then finally it started again:15/02/03 23:31:34 INFO scheduler.TaskSetManager:
Finished task 1.0 in stage 3.0 (TID 7301) in 7208259 ms on
ip-10-171-0-124.ec2.internal (3/4)15/02/03 23:31:34 INFO
scheduler.TaskSetManager: Finished task 0.0 in stage 3.0 (TID 7300) in 7208503
ms on ip-10-171-0-128.ec2.internal (4/4)15/02/03 23:31:34 INFO
scheduler.DAGScheduler: Stage 3 (mapPartitions at Transposer.scala:211)
finished in 7209.534 s