A few improvements to DataNodeCluster - HADOOP-5556 
----------------------------------------------------

                 Key: HDFS-555
                 URL: https://issues.apache.org/jira/browse/HDFS-555
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: test
    Affects Versions: 0.21.0
            Reporter: Ravi Phulari
            Assignee: Hairong Kuang
             Fix For: 0.21.0


Opening this JIRA to track the HDFS code changes made in HADOOP-5556.

DataNodeCluster is a great tool for simulating a large-scale DFS cluster on a 
small set of machines. A few suggestions to improve this tool:

   1. DataNodeCluster uses MiniDFSCluster#startDataNode to start multiple 
instances of DataNode on one machine. MiniDFSCluster sets each DataNode's 
address to 127.0.0.1. We should allow the address to be set to 0.0.0.0 so that 
DataNodes on different machines can communicate.
   2. Currently the size of the blocks injected into DataNodes and created by 
CreateEditsLog is hardcoded as 10. It would be more convenient if this were 
configurable. We also need to make sure that both use the same block size.
   3. If the replication factor of blocks is larger than 1, a DataNode in 
DataNodeCluster currently has blocks injected multiple times and therefore 
sends multiple block reports to the NameNode. The initial block reports contain 
only a portion of the DataNode's blocks and may therefore trigger unnecessary 
block replications. It would be cleaner if each DataNode sent a single block 
report containing all of its blocks.
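
To illustrate suggestion 3, here is a minimal sketch of grouping all replicas 
per DataNode before injecting, so that each DataNode is injected once and sends 
one complete block report. This is not the DataNodeCluster code: the class 
name, the method groupBlocksByDataNode, and the round-robin placement are all 
hypothetical and chosen only for illustration, and the sketch uses plain Java 
collections rather than any Hadoop API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from Hadoop.
class BlockInjectionSketch {

    /**
     * Places every replica of every block on a simulated DataNode (round-robin,
     * an assumption made for this sketch) and returns one complete block list
     * per DataNode. Each list can then be injected in a single call, instead
     * of injecting once per replication round.
     */
    static List<List<Long>> groupBlocksByDataNode(long[] blockIds,
                                                  int replication,
                                                  int numDataNodes) {
        if (replication < 1 || replication > numDataNodes) {
            throw new IllegalArgumentException(
                "replication must be between 1 and numDataNodes");
        }
        List<List<Long>> perNode = new ArrayList<>();
        for (int i = 0; i < numDataNodes; i++) {
            perNode.add(new ArrayList<>());
        }
        int next = 0;
        for (long id : blockIds) {
            // Consecutive slots are distinct because replication <= numDataNodes,
            // so no DataNode receives two replicas of the same block.
            for (int r = 0; r < replication; r++) {
                perNode.get(next).add(id);
                next = (next + 1) % numDataNodes;
            }
        }
        return perNode;
    }

    public static void main(String[] args) {
        List<List<Long>> perNode =
            groupBlocksByDataNode(new long[] {1, 2, 3}, 2, 3);
        for (int i = 0; i < perNode.size(); i++) {
            // One injection, and hence one block report, per DataNode.
            System.out.println("DataNode " + i + " blocks: " + perNode.get(i));
        }
    }
}
```

With the complete list built up front, the NameNode never sees a partial view 
of a DataNode's blocks, which is what avoids the unnecessary replications 
described above.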



