Vinay created HDFS-5504:
---------------------------

             Summary: In HA mode, OP_DELETE_SNAPSHOT is not decrementing the 
safemode threshold, leads to NN safemode.
                 Key: HDFS-5504
                 URL: https://issues.apache.org/jira/browse/HDFS-5504
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: snapshots
            Reporter: Vinay
            Assignee: Vinay


1. HA installation; the standby NN is down.
2. A delete snapshot call is made. It deletes the blocks from the blocksmap
and from all datanodes. The log sync has also happened.
3. The NN crashes before the next log roll.
4. When the namenode restarts, it loads the fsimage and the finalized edits
from shared storage and sets the safemode threshold, which still includes the
blocks from the deleted snapshot (the delete snapshot edit has not been read
yet, because the namenode went down before the last edits segment was
finalized).
5. When it becomes active, it finalizes the edits and reads the delete snapshot
edit op, but at this point it does not reduce the safemode block count, so it
remains in safemode.
6. On the next restart, the edits are already finalized, so they are read
during startup and the safemode threshold is set correctly.

So one more restart is needed to bring the NN out of safemode.
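
The accounting problem in steps 4 and 5 can be illustrated with a small,
simplified model. This is only a sketch and not the actual NameNode code: the
class SafeModeSketch, its fields and methods, the block counts, and the 0.999
threshold fraction are all hypothetical values chosen for illustration. It
shows that if the expected block total is not reduced when the delete snapshot
op is replayed, the datanode reports can never reach the threshold.

```java
// Hypothetical, simplified model of safemode block accounting.
// None of these names are taken from the Hadoop source.
public class SafeModeSketch {
    private long blockTotal;            // blocks the NN expects to be reported
    private long blockSafe;             // blocks reported by datanodes so far
    private final double thresholdPct;  // fraction of blockTotal required

    SafeModeSketch(long blockTotal, double thresholdPct) {
        this.blockTotal = blockTotal;
        this.thresholdPct = thresholdPct;
    }

    // Datanodes report blocks they actually hold.
    void reportBlocks(long count) {
        blockSafe += count;
    }

    // Replaying a delete snapshot op removes blocks from the namespace.
    // adjustTotals == false models the behaviour described in this issue.
    void replayDeleteSnapshot(long deletedBlocks, boolean adjustTotals) {
        if (adjustTotals) {
            blockTotal -= deletedBlocks;
        }
    }

    boolean canLeaveSafeMode() {
        long blockThreshold = (long) (blockTotal * thresholdPct);
        return blockSafe >= blockThreshold;
    }

    // Runs the scenario: threshold set from fsimage + finalized edits,
    // then the unfinalized segment containing the delete is replayed.
    static boolean simulate(boolean adjustTotals) {
        // 1000 blocks known at startup, 200 of them belong to the snapshot
        // that was already deleted on the datanodes (assumed numbers).
        SafeModeSketch nn = new SafeModeSketch(1000, 0.999);
        nn.replayDeleteSnapshot(200, adjustTotals);
        nn.reportBlocks(800); // datanodes only have the remaining blocks
        return nn.canLeaveSafeMode();
    }

    public static void main(String[] args) {
        System.out.println("without adjustment: " + simulate(false));
        System.out.println("with adjustment:    " + simulate(true));
    }
}
```

In this model, simulate(false) corresponds to the reported behaviour (the NN
stays in safemode), and simulate(true) corresponds to decrementing the totals
when the op is replayed.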



--
This message was sent by Atlassian JIRA
(v6.1#6144)
