Riza Suminto created HDFS-13517:
-----------------------------------
Summary: MetaSave command can block NameNode for long time
Key: HDFS-13517
URL: https://issues.apache.org/jira/browse/HDFS-13517
Project: Hadoop HDFS
Issue Type: Improvement
Components: namenode
Affects Versions: 2.9.0
Reporter: Riza Suminto
The hdfs metasave command performs full iterations over BlockManager lists
such as neededReplications, postponedMisreplicatedBlocks, and so on. This does
not scale well when there are millions of under-replicated data blocks in the
cluster, for example due to heavy load or network errors.
We tested the metasave command by modifying NNThroughputBenchmark to simulate
a large number of under-replicated data blocks. We found that with about 16
million under-replicated blocks, the metasave command can take up to 29
seconds while holding the FSNamesystem write lock. It is probably safer to cap
the iteration and output size of the metasave command, so that it does not
block the NameNode for too long.
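
A minimal sketch of what capping the iteration could look like, assuming a
plain Iterable stands in for a BlockManager list such as neededReplications.
The class name CappedDump, the method dumpCapped, and the maxEntries parameter
are hypothetical and are not existing HDFS APIs; this only illustrates the
idea and is not a proposed patch.

```java
import java.io.PrintWriter;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of HDFS: dumps at most maxEntries items so
// that time spent under a lock is bounded by the cap, not by the list size.
class CappedDump {

    /**
     * Writes up to maxEntries items from the given collection, one per line.
     * If more items remain, a single truncation notice is written instead.
     *
     * @return the number of items actually written
     */
    static <T> int dumpCapped(Iterable<T> items, int maxEntries, PrintWriter out) {
        int written = 0;
        Iterator<T> it = items.iterator();
        while (written < maxEntries && it.hasNext()) {
            out.println(it.next());
            written++;
        }
        if (it.hasNext()) {
            // Stop early rather than walking the rest of the list.
            out.println("... output truncated after " + written + " entries");
        }
        return written;
    }

    public static void main(String[] args) {
        // Stand-in data; a real list would hold block descriptors.
        List<String> blocks = Arrays.asList("blk_1", "blk_2", "blk_3", "blk_4");
        PrintWriter out = new PrintWriter(System.out, true);
        dumpCapped(blocks, 2, out);
    }
}
```

With a cap like this, the cost of the dump depends on maxEntries rather than
on the number of under-replicated blocks. The trade-off is that the metasave
output no longer lists every block.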