[ https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer resolved HDFS-1262.
------------------------------------
    Resolution: Won't Fix

> Failed pipeline creation during append leaves lease hanging on NN
> -----------------------------------------------------------------
>
>                 Key: HDFS-1262
>                 URL: https://issues.apache.org/jira/browse/HDFS-1262
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client, namenode
>    Affects Versions: 0.20-append
>            Reporter: Todd Lipcon
>            Assignee: sam rash
>            Priority: Critical
>         Attachments: hdfs-1262-1.txt, hdfs-1262-2.txt, hdfs-1262-3.txt,
>                      hdfs-1262-4.txt, hdfs-1262-5.txt
>
>
> Ryan Rawson came upon this nasty bug in HBase cluster testing. What happened
> was the following:
> 1) The file's original writer died.
> 2) The recovery client tried to open the file for append. It looped for a
>    minute or so until the soft lease expired, then the append call initiated
>    recovery.
> 3) Recovery completed successfully.
> 4) The recovery client called append again, which succeeded on the NN.
> 5) For some reason, the block recovery that happens at the start of append
>    pipeline creation failed on all datanodes 6 times, causing the append()
>    call to throw an exception back to the HBase master. HBase assumed the
>    file wasn't open and put it back on a queue to try later.
> 6) Some time later, it tried append again, but the lease was still assigned
>    to the same DFS client, so it wasn't able to recover.
>
> The recovery failure in step 5 is a separate issue, but the problem for this
> JIRA is that the client can think it failed to open a file for append when
> the NN thinks the writer holds a lease. Since the writer keeps renewing its
> lease, recovery never happens, and no one can open or recover the file until
> the DFS client shuts down.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)