I wouldn't be concerned by those ResourceManager log messages; as far as I can tell they just mean the queue is full and the reserved containers are waiting for running ones to finish (see the inline note under the log excerpt below). What would be concerning is the NodeManager reporting that it was killing containers for exceeding resource limits.
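If you want to double-check that, a quick grep on each worker node should surface any such kills. This is just a sketch: the patterns match what I'd expect from the ContainersMonitor messages, and the log path is a guess at a typical EMR/Hadoop layout, so adjust it for your cluster:

    # Log path is an assumption (typical Hadoop/EMR default); adjust for your install.
    grep -iE "killing container|running beyond (physical|virtual) memory limits" \
        /var/log/hadoop-yarn/yarn-yarn-nodemanager-*.log

No matches would mean the containers are alive and simply taking a long time, which matches what you're describing.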
-Sandy

On Wed, Feb 4, 2015 at 10:19 AM, Michael Albert <m_albert...@yahoo.com> wrote:

> Greetings!
>
> Thanks to all who have taken the time to look at this.
>
> While the process is stalled, I see, in the yarn log on the head node, repeating messages of the form "Trying to fulfill reservation for application XXX on node YYY", but that node is reserved by XXX_000001. Below is a chunk of the log.
>
> Any suggestions as to what I can investigate?
>
> Thanks!
> -Mike
>
> 2015-02-04 18:18:28,617 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (ResourceManager Event Processor): Trying to fulfill reservation for application application_1422834185427_0088 on node: ip-10-171-0-124.ec2.internal:9103
> 2015-02-04 18:18:28,617 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue (ResourceManager Event Processor): assignContainers: node=ip-10-171-0-124.ec2.internal application=88 priority=1 request={Priority: 1, Capability: <memory:8704, vCores:1>, # Containers: 17, Labels: , Location: *, Relax Locality: true} type=OFF_SWITCH
> 2015-02-04 18:18:28,617 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt (ResourceManager Event Processor): Application application_1422834185427_0088 reserved container container_1422834185427_0088_01_000025 on node host: ip-10-171-0-124.ec2.internal:9103 #containers=2 available=5632 used=17408, currently has 6 at priority 1; currentReservation 52224
> 2015-02-04 18:18:28,618 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue (ResourceManager Event Processor): Reserved container application=application_1422834185427_0088 resource=<memory:8704, vCores:1> queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:226304, vCores:26>usedCapacity=0.9822222, absoluteUsedCapacity=0.9822222, numApps=1, numContainers=26 usedCapacity=0.9822222 absoluteUsedCapacity=0.9822222 used=<memory:226304, vCores:26> cluster=<memory:230400, vCores:160>
> 2015-02-04 18:18:28,618 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (ResourceManager Event Processor): Skipping scheduling since node ip-10-171-0-124.ec2.internal:9103 is reserved by application appattempt_1422834185427_0088_000001
> 2015-02-04 18:18:28,623 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue (ResourceManager Event Processor): default usedResources: <memory:226304, vCores:26> clusterResources: <memory:230400, vCores:160> currentCapacity 0.9822222 required <memory:8704, vCores:1> potentialNewCapacity: 1.02 (max-capacity: 1.0)
> 2015-02-04 18:18:28,625 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (ResourceManager Event Processor): Trying to fulfill reservation for application application_1422834185427_0088 on node: ip-10-171-0-119.ec2.internal:9103
> 2015-02-04 18:18:28,625 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue (ResourceManager Event Processor): assignContainers: node=ip-10-171-0-119.ec2.internal application=88 priority=1 request={Priority: 1, Capability: <memory:8704, vCores:1>, # Containers: 17, Labels: , Location: *, Relax Locality: true} type=OFF_SWITCH
> 2015-02-04 18:18:28,626 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt (ResourceManager Event Processor): Application application_1422834185427_0088 reserved container container_1422834185427_0088_01_000026 on node host: ip-10-171-0-119.ec2.internal:9103 #containers=2 available=5632 used=17408, currently has 6 at priority 1; currentReservation 52224
> 2015-02-04 18:18:28,626 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue (ResourceManager Event Processor): Reserved container application=application_1422834185427_0088 resource=<memory:8704, vCores:1> queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:226304, vCores:26>usedCapacity=0.9822222, absoluteUsedCapacity=0.9822222, numApps=1, numContainers=26 usedCapacity=0.9822222 absoluteUsedCapacity=0.9822222 used=<memory:226304, vCores:26> cluster=<memory:230400, vCores:160>
> 2015-02-04 18:18:28,626 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (ResourceManager Event Processor): Skipping scheduling since node ip-10-171-0-119.ec2.internal:9103 is reserved by application appattempt_1422834185427_0088_000001
> 2015-02-04 18:18:28,636 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (ResourceManager Event Processor): Trying to fulfill reservation for application application_1422834185427_0088 on node: ip-10-171-0-129.ec2.internal:9103
> 2015-02-04 18:18:28,636 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue (ResourceManager Event Processor): assignContainers: node=ip-10-171-0-129.ec2.internal application=88 priority=1 request={Priority: 1, Capability: <memory:8704, vCores:1>, # Containers: 17, Labels: , Location: *, Relax Locality: true} type=OFF_SWITCH
> 2015-02-04 18:18:28,636 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt (ResourceManager Event Processor): Application application_1422834185427_0088 reserved container container_1422834185427_0088_01_000021 on node host: ip-10-171-0-129.ec2.internal:9103 #containers=2 available=5632 used=17408, currently has 6 at priority 1; currentReservation 52224
> 2015-02-04 18:18:28,636 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue (ResourceManager Event Processor): Reserved container application=application_1422834185427_0088 resource=<memory:8704, vCores:1> queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:226304, vCores:26>usedCapacity=0.9822222, absoluteUsedCapacity=0.9822222, numApps=1, numContainers=26 usedCapacity=0.9822222 absoluteUsedCapacity=0.9822222 used=<memory:226304, vCores:26> cluster=<memory:230400, vCores:160>
> 2015-02-04 18:18:28,636 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (ResourceManager Event Processor): Skipping scheduling since node ip-10-171-0-129.ec2.internal:9103 is reserved by application appattempt_1422834185427_0088_000001
> 2015-02-04 18:18:28,645 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue (ResourceManager Event Processor): default usedResources: <memory:226304, vCores:26> clusterResources: <memory:230400, vCores:160> currentCapacity 0.9822222 required <memory:8704, vCores:1> potentialNewCapacity: 1.02 (max-capacity: 1.0)
> 2015-02-04 18:18:28,646 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (ResourceManager Event Processor): Trying to fulfill reservation for application application_1422834185427_0088 on node: ip-10-171-0-122.ec2.internal:9103
> 2015-02-04 18:18:28,646 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue (ResourceManager Event Processor): assignContainers: node=ip-10-171-0-122.ec2.internal application=88 priority=1 request={Priority: 1, Capability: <memory:8704, vCores:1>, # Containers: 17, Labels: , Location: *, Relax Locality: true} type=OFF_SWITCH
> 2015-02-04 18:18:28,646 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt (ResourceManager Event Processor): Application application_1422834185427_0088 reserved container container_1422834185427_0088_01_000023 on node host: ip-10-171-0-122.ec2.internal:9103 #containers=2 available=5632 used=17408, currently has 6 at priority 1; currentReservation 52224
> 2015-02-04 18:18:28,646 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue (ResourceManager Event Processor): Reserved container application=application_1422834185427_0088 resource=<memory:8704, vCores:1> queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:226304, vCores:26>usedCapacity=0.9822222, absoluteUsedCapacity=0.9822222, numApps=1, numContainers=26 usedCapacity=0.9822222 absoluteUsedCapacity=0.9822222 used=<memory:226304, vCores:26> cluster=<memory:230400, vCores:160>
> 2015-02-04 18:18:28,646 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (ResourceManager Event Processor): Skipping scheduling since node ip-10-171-0-122.ec2.internal:9103 is reserved by application appattempt_1422834185427_0088_000001
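As far as I can tell from the numbers in that excerpt, nothing is wedged on the YARN side; the queue is simply full. The default queue shows used=<memory:226304> against cluster=<memory:230400>, and each pending container wants 8704 MB, so placing one more would need 226304 + 8704 = 235008 MB, roughly 1.02 of capacity, which is exactly the "potentialNewCapacity: 1.02 (max-capacity: 1.0)" lines. Each reserved node also has only 5632 MB free, less than the 8704 MB requested. So the scheduler will keep logging these reservation messages until one of the running containers finishes and frees memory.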
> ------------------------------
> From: Sandy Ryza <sandy.r...@cloudera.com>
> To: Imran Rashid <iras...@cloudera.com>
> Cc: Michael Albert <m_albert...@yahoo.com>; "user@spark.apache.org" <user@spark.apache.org>
> Sent: Wednesday, February 4, 2015 12:54 PM
> Subject: Re: advice on diagnosing Spark stall for 1.5hr out of 3.5hr job?
>
> Also, do you see any lines in the YARN NodeManager logs where it says that it's killing a container?
>
> -Sandy
>
> On Wed, Feb 4, 2015 at 8:56 AM, Imran Rashid <iras...@cloudera.com> wrote:
>
> Hi Michael,
>
> Judging from the logs, it seems that those tasks are just working a really long time. If you have long running tasks, then you wouldn't expect the driver to output anything while those tasks are working.
>
> What is unusual is that there is no activity during all that time the tasks are executing. Are you sure you are looking at the activity of the executors (the nodes that are actually running the tasks), and not the activity of the driver node (the node where your "main" program lives, but that doesn't do any of the distributed computation)? It would be perfectly normal for the driver node to be idle while all the executors were busy with long running tasks.
>
> I would look at:
> (a) the cpu usage etc. of the executor nodes during those long running tasks
> (b) the thread dumps of the executors during those long running tasks (available via the UI under the "Executors" tab, or just log into the boxes and run jstack). Ideally this will point out a hotspot in your code that is making these tasks take so long. (Or perhaps it'll point out what is going on in spark internals that is so slow.)
> (c) the summary metrics for the long running stage, when it finally finishes (also available in the UI, under the "Stages" tab). You will get a breakdown of how much time is spent in various phases of the tasks, how much data is read, etc., which can help you figure out why tasks are slow.
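For (b), here's a rough sketch of grabbing a few spaced-out thread dumps from an executor over SSH. It assumes the executor shows up in jps under the usual CoarseGrainedExecutorBackend main class and that you can attach to it (you may need to run jstack as the same user the container runs as, e.g. via sudo):

    # On a worker node: find the executor JVM and take three thread dumps, 30s apart.
    pid=$(jps -l | grep CoarseGrainedExecutorBackend | awk '{print $1}' | head -n 1)
    for i in 1 2 3; do
        jstack "$pid" > "executor-jstack-$i.txt"
        sleep 30
    done

If the same stack frames show up in every dump, that is usually the hotspot Imran is talking about.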
> Hopefully this will help you find out what is taking so long. If you find out the executors really aren't doing anything during these really long tasks, it would be great to find that out, and maybe get some more info for a bug report.
>
> Imran
>
> On Tue, Feb 3, 2015 at 6:18 PM, Michael Albert <m_albert...@yahoo.com.invalid> wrote:
>
> Greetings!
>
> First, my sincere thanks to all who have given me advice. Following previous discussion, I've rearranged my code to try to keep the partitions to more manageable sizes. Thanks to all who commented.
>
> At the moment, the input set I'm trying to work with is about 90GB (avro parquet format).
>
> When I run on a reasonable chunk of the data (say half) things work reasonably.
>
> On the full data, the spark process stalls. That is, for about 1.5 hours out of a 3.5 hour run, I see no activity. No cpu usage, no error message, no network activity. It just seems to sit there. The messages bracketing the stall are shown below.
>
> Any advice on how to diagnose this? I don't get any error messages. The spark UI says that it is running a stage, but it makes no discernible progress. Ganglia shows no CPU usage or network activity. When I shell into the worker nodes there are no filled disks or other obvious problems.
>
> How can I discern what Spark is waiting for?
>
> The only weird thing seen, other than the stall, is that the yarn logs on the workers have lines with messages like this:
> 2015-02-03 22:59:58,890 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl (Container Monitor): Memory usage of ProcessTree 13158 for container-id container_1422834185427_0083_01_000021: 7.1 GB of 8.5 GB physical memory used; 7.6 GB of 42.5 GB virtual memory used
>
> It's rather strange that it mentions 42.5 GB of virtual memory. The machines are EMR machines with 32 GB of physical memory and, as far as I can determine, no swap space.
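A note on the 42.5 GB, since it looks odd: as far as I know that figure has nothing to do with the machine's RAM or swap. The NodeManager derives a per-container virtual memory ceiling as the container's physical allocation multiplied by yarn.nodemanager.vmem-pmem-ratio, so 8.5 GB x 5 = 42.5 GB here (the ratio is evidently set to 5 on this cluster; the stock default is 2.1). At 7.6 GB of virtual and 7.1 GB of 8.5 GB physical, that container is within both limits, which is consistent with nothing getting killed.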
> The messages bracketing the stall are shown below.
>
> Any advice is welcome.
>
> Thanks!
>
> Sincerely,
> Mike Albert
>
> Before the stall.
> 15/02/03 21:45:28 INFO cluster.YarnClientClusterScheduler: Removed TaskSet 5.0, whose tasks have all completed, from pool
> 15/02/03 21:45:28 INFO scheduler.DAGScheduler: Stage 5 (mapPartitionsWithIndex at Transposer.scala:147) finished in 4880.317 s
> 15/02/03 21:45:28 INFO scheduler.DAGScheduler: looking for newly runnable stages
> 15/02/03 21:45:28 INFO scheduler.DAGScheduler: running: Set(Stage 3)
> 15/02/03 21:45:28 INFO scheduler.DAGScheduler: waiting: Set(Stage 6, Stage 7, Stage 8)
> 15/02/03 21:45:28 INFO scheduler.DAGScheduler: failed: Set()
> 15/02/03 21:45:28 INFO scheduler.DAGScheduler: Missing parents for Stage 6: List(Stage 3)
> 15/02/03 21:45:28 INFO scheduler.DAGScheduler: Missing parents for Stage 7: List(Stage 6)
> 15/02/03 21:45:28 INFO scheduler.DAGScheduler: Missing parents for Stage 8: List(Stage 7)
>
> At this point, I see no activity for 1.5 hours except for this (XXX for I.P. address)
> 15/02/03 22:13:24 INFO util.AkkaUtils: Connecting to ExecutorActor: akka.tcp://sparkExecutor@ip-XXX.ec2.internal:36301/user/ExecutorActor
>
> Then finally it started again:
> 15/02/03 23:31:34 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 3.0 (TID 7301) in 7208259 ms on ip-10-171-0-124.ec2.internal (3/4)
> 15/02/03 23:31:34 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 3.0 (TID 7300) in 7208503 ms on ip-10-171-0-128.ec2.internal (4/4)
> 15/02/03 23:31:34 INFO scheduler.DAGScheduler: Stage 3 (mapPartitions at Transposer.scala:211) finished in 7209.534 s