Walter Su created HDFS-9122:
-------------------------------

             Summary: DN automatically add more volumes to avoid large volume
                 Key: HDFS-9122
                 URL: https://issues.apache.org/jira/browse/HDFS-9122
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Walter Su


Currently, if a DataNode has too many blocks, it partitions the blockReport by 
storage. In practice, we've seen that a single storage can contain a huge number 
of blocks, and the report can even exceed the max RPC data length. Storage 
density is increasing quickly, so a DataNode can hold more and more blocks, and 
it's getting harder to fit so many blocks into one RPC report. One option is 
"Support splitting BlockReport of a storage into multiple RPC" (HDFS-9011).
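
For context, the hard limit such a report runs into is the server-side RPC 
payload cap, ipc.maximum.data.length. The snippet below only illustrates where 
the limit comes from (the 64 MB value is the Hadoop default); it is not a 
setting this proposal changes:

{code:xml}
<!-- core-site.xml: maximum size of a single RPC payload accepted by the
     server. A block report for one very dense storage can exceed this. -->
<property>
  <name>ipc.maximum.data.length</name>
  <value>67108864</value>
</property>
{code}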

I'm thinking maybe we could add more "logical" volumes (more storage 
directories on one device). A DataNodeStorageInfo in the NameNode is cheap. And 
processing a single blockReport requires the NN to hold the lock, so splitting 
one big volume into many volumes keeps any single report from holding the lock 
too long.

We could support wildcards in dfs.datanode.data.dir, like 
/physical-volume/dfs/data/dir*
When a volume exceeds a threshold (e.g. 1M blocks), the DN automatically 
creates a new storage directory, which is also a new volume. We have to change 
RoundRobinVolumeChoosingPolicy as well: once we have chosen a physical volume, 
we choose the logical volume with the fewest blocks, as in the sketch below.
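
Roughly, the policy change could look like this. It is only an illustration 
with hypothetical types (LogicalVolume, blockCount()), not the real 
VolumeChoosingPolicy interface: we keep round-robin across physical devices as 
today, and within the chosen device we pick the storage directory holding the 
fewest blocks.

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class TwoLevelVolumeChoosingPolicy {

  /** Hypothetical view of one storage directory (a "logical" volume). */
  public interface LogicalVolume {
    long blockCount();                        // blocks currently in this directory
    long availableSpace() throws IOException; // free bytes on the underlying device
  }

  private int nextDevice = 0; // round-robin cursor over physical devices

  /**
   * Level 1: pick a physical device round-robin, as today.
   * Level 2: within that device, pick the logical volume with the fewest
   * blocks, so a freshly created directory absorbs new writes first.
   * The map should have a stable iteration order (e.g. a LinkedHashMap
   * keyed by device path) for the round-robin to be meaningful.
   */
  public LogicalVolume chooseVolume(Map<String, List<LogicalVolume>> volumesByDevice,
                                    long replicaSize) throws IOException {
    List<String> devices = new ArrayList<>(volumesByDevice.keySet());
    for (int i = 0; i < devices.size(); i++) {
      String device = devices.get((nextDevice + i) % devices.size());
      LogicalVolume best = volumesByDevice.get(device).stream()
          .min(Comparator.comparingLong(LogicalVolume::blockCount))
          .orElse(null);
      if (best != null && best.availableSpace() >= replicaSize) {
        nextDevice = (nextDevice + i + 1) % devices.size();
        return best;
      }
    }
    throw new IOException("No volume can fit a replica of " + replicaSize + " bytes");
  }
}
{code}

Choosing the least-populated directory inside a device means a newly created 
directory naturally takes the new writes, so no single storage keeps growing 
past the threshold.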



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
