Haohui Mai created HDFS-8782:
--------------------------------

             Summary: Upgrade to block ID-based DN storage layout delays DN registration
                 Key: HDFS-8782
                 URL: https://issues.apache.org/jira/browse/HDFS-8782
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Haohui Mai
            Priority: Critical

We have seen multiple incidents at production sites in which DNs take a long time to register with the NN after upgrading to a post-2.6 release. Further investigation shows that the DN is blocked while upgrading to the storage layout introduced in HDFS-6482.

The new storage layout requires creating up to 64k directories in the underlying file system. Unfortunately, the current implementation calls {{mkdirs()}} sequentially and upgrades the volumes one after another. As a result, upgrading a DN that has many disks, or whose blocks have random block IDs, takes a long time (usually hours), and the DN won't register with the NN until it has finished upgrading all of its storage directories.

The excessive delays confuse operators and break the assumptions of rolling upgrades.
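
To illustrate the bottleneck described above, here is a minimal sketch of creating the directory tree for each volume concurrently instead of one volume after another. It is not the patch for this issue and does not use any HDFS classes. The class name {{ParallelLayoutUpgrade}}, the methods {{createLayout}} and {{upgradeAll}}, and the {{fanout}} parameter are hypothetical. The two-level {{subdirN/subdirM}} naming is an assumption about the layout (a fanout of 256 would give the 64k directories mentioned above).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelLayoutUpgrade {

    // Creates a two-level directory tree under one volume root and returns
    // the number of leaf directories requested. The "subdir" naming and the
    // two-level shape are assumptions made for this sketch.
    static int createLayout(Path volumeRoot, int fanout) throws IOException {
        int created = 0;
        for (int i = 0; i < fanout; i++) {
            for (int j = 0; j < fanout; j++) {
                // createDirectories also creates missing parents and does
                // not fail if the directory already exists.
                Files.createDirectories(
                    volumeRoot.resolve("subdir" + i).resolve("subdir" + j));
                created++;
            }
        }
        return created;
    }

    // Runs createLayout for every volume on its own thread, so that a slow
    // disk does not hold up the others. Returns the total directory count.
    static int upgradeAll(List<Path> volumes, int fanout)
            throws InterruptedException, ExecutionException {
        // One thread per volume; newFixedThreadPool rejects a size of 0.
        ExecutorService pool =
            Executors.newFixedThreadPool(Math.max(1, volumes.size()));
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (Path volume : volumes) {
                futures.add(pool.submit(() -> createLayout(volume, fanout)));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                // get() rethrows any IOException wrapped in ExecutionException.
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling {{ParallelLayoutUpgrade.upgradeAll(volumes, 256)}} with one {{Path}} per disk would request 256 x 256 directories on each volume. Parallelising per volume rather than per directory keeps the work on each disk sequential, which is likely the gentler choice for spinning disks. Whether that is enough to meet rolling-upgrade expectations would have to be measured.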