[ https://issues.apache.org/jira/browse/HDFS-871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer resolved HDFS-871.
-----------------------------------

    Resolution: Fixed

Likely stale.

> Balancer can hang in PendingBlockMove
> -------------------------------------
>
>                 Key: HDFS-871
>                 URL: https://issues.apache.org/jira/browse/HDFS-871
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer
>    Affects Versions: 0.20.1
>         Environment: Yahoo 0.20
>            Reporter: Andrew Ryan
>         Attachments: balancer-jstack.out
>
>
> We started the balancer with default options (-threshold 10). It ran fine 
> for a few hours, then hung: the process was still alive, but no balancing 
> was taking place.
> At the time of the hang, jstack showed three threads in RUNNABLE status. 
> Subsequent jstacks taken minutes and hours later showed the same three 
> threads at the same place, so I don't think requests were being restarted; 
> these look like genuine hangs. My best guess is that there is no timeout on 
> these requests to the namenode, and there needs to be.
> I'll attach the full jstack output, but here's a sample thread; they are all 
> stuck in the same place.
> "pool-1-thread-972" prio=10 tid=0x00002aaafc23a800 nid=0x27a8 runnable [0x00002aab0a9a2000]
>    java.lang.Thread.State: RUNNABLE
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>         - locked <0x00002aaaebdbe158> (a java.io.BufferedInputStream)
>         at java.io.DataInputStream.readShort(DataInputStream.java:295)
>         at org.apache.hadoop.hdfs.server.balancer.Balancer$PendingBlockMove.receiveResponse(Balancer.java:371)
>         at org.apache.hadoop.hdfs.server.balancer.Balancer$PendingBlockMove.dispatch(Balancer.java:326)
>         at org.apache.hadoop.hdfs.server.balancer.Balancer$PendingBlockMove.access$1800(Balancer.java:232)
>         at org.apache.hadoop.hdfs.server.balancer.Balancer$PendingBlockMove$1.run(Balancer.java:393)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
>         at java.lang.Thread.run(Thread.java:619)
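
The stack trace above shows a thread blocked in SocketInputStream.socketRead0 beneath DataInputStream.readShort, which is consistent with the reporter's guess that the read has no timeout. As a rough illustration only (this is not the Balancer code, and the class and method names ReadTimeoutSketch and readTimesOut are hypothetical), the sketch below shows how a read timeout set with the standard Socket.setSoTimeout call turns an indefinite block into a SocketTimeoutException when the remote peer never replies. It assumes a loopback server that accepts the connection at the TCP level but never writes a response.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; not part of Hadoop.
public class ReadTimeoutSketch {

    /**
     * Connects to a local peer that never sends data and attempts the same
     * kind of read seen in the stack trace (readShort on a buffered stream).
     * Returns true if the read was cut short by the socket read timeout.
     */
    static boolean readTimesOut(int timeoutMillis) {
        // The server socket is never accept()ed or written to: the kernel
        // completes the TCP handshake, but no response bytes ever arrive.
        try (ServerSocket silentPeer =
                     new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket sock = new Socket(silentPeer.getInetAddress(),
                                      silentPeer.getLocalPort())) {

            // Without this call the read below would block indefinitely.
            sock.setSoTimeout(timeoutMillis);

            DataInputStream in = new DataInputStream(
                    new BufferedInputStream(sock.getInputStream()));
            try {
                in.readShort();   // waits for a 2-byte status that never comes
                return false;     // a response arrived (not expected here)
            } catch (SocketTimeoutException e) {
                return true;      // the read gave up after timeoutMillis
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```

With a timeout in place, the caller can treat the exception as a failed request and release the worker thread, rather than leaving it parked in the read. The 200 ms value is arbitrary and chosen only to keep the demo short.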



--
This message was sent by Atlassian JIRA
(v6.2#6252)
