He Xiaoqiao created HDFS-14559:
----------------------------------

             Summary: Optimizing safemode leave mechanism
                 Key: HDFS-14559
                 URL: https://issues.apache.org/jira/browse/HDFS-14559
             Project: Hadoop HDFS
          Issue Type: Sub-task
          Components: namenode
            Reporter: He Xiaoqiao
            Assignee: He Xiaoqiao


As mentioned in HDFS-14186, in the last stage of namenode startup the namenode
leaves safemode once the number of reported blocks reaches the threshold.
However, the current condition is based entirely on total blocks rather than
total replicas. So for a large cluster, after the required number of blocks
has been reported by datanodes, a large number of block replicas are still
pending report, and the namenode load stays high for a long time. In some
extreme cases, between the time it leaves safemode and the time it finishes
processing block reports, the namenode cannot provide normal service, and some
datanodes may be marked dead and then register and send block reports again
and again.
In short, we need to improve the safemode leave mechanism so that large
clusters restart smoothly.
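
To make the difference concrete, here is a minimal Java sketch. It is not the
actual namenode code and not a proposed patch: the class and method names
(SafeModeSketch, blockThresholdReached, replicaThresholdReached) are
hypothetical, and the replica-based check is only one possible reading of the
idea above. The 0.999 value in the usage note is assumed to mirror the default
of dfs.namenode.safemode.threshold-pct.

```java
/** Illustrative only: hypothetical names, not the real namenode safemode code. */
final class SafeModeSketch {

    private SafeModeSketch() {
    }

    /**
     * Block-based condition, as described in this issue: a block counts as
     * "safe" as soon as enough of its replicas (typically one) are reported,
     * so the check says nothing about replicas that are still unreported.
     */
    static boolean blockThresholdReached(long safeBlocks, long totalBlocks,
                                         double thresholdPct) {
        long blockThreshold = (long) (totalBlocks * thresholdPct);
        return safeBlocks >= blockThreshold;
    }

    /**
     * Replica-based condition (assumption): compare the number of replicas
     * reported so far with the number of replicas the namespace expects.
     */
    static boolean replicaThresholdReached(long reportedReplicas,
                                           long expectedReplicas,
                                           double thresholdPct) {
        long replicaThreshold = (long) (expectedReplicas * thresholdPct);
        return reportedReplicas >= replicaThreshold;
    }
}
```

For example, with 1000 blocks at replication factor 3 (3000 expected
replicas), a state in which every block has at least one reported replica but
only 1200 replicas have arrived in total satisfies
blockThresholdReached(1000, 1000, 0.999) while
replicaThresholdReached(1200, 3000, 0.999) is still false. That gap is the
window in which the namenode has left safemode but is still busy processing
block reports.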



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
