Xi Fang created HDFS-5001:
-----------------------------

             Summary: Branch-1-Win TestAzureBlockPlacementPolicy and 
TestReplicationPolicyWithNodeGroup failed
                 Key: HDFS-5001
                 URL: https://issues.apache.org/jira/browse/HDFS-5001
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 1-win
            Reporter: Xi Fang
             Fix For: 1-win


After the backport patch for HDFS-4975 was committed, 
TestAzureBlockPlacementPolicy and TestReplicationPolicyWithNodeGroup started 
failing. TestReplicationPolicyWithNodeGroup fails because part of the 
HDFS-3941 patch is missing. Our patch for HADOOP-495 causes methods in the 
superclass to be called incorrectly. More specifically, HDFS-4975 backported 
HDFS-4350, HDFS-4351, and HDFS-3912 to add the method parameter 
"boolean avoidStaleNodes", and updated the APIs in BlockPlacementPolicyDefault 
accordingly. However, the overriding methods in AzureBlockPlacementPolicy and 
ReplicationPolicyWithNodeGroup weren't updated to match.
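
To illustrate the mechanism, here is a minimal, self-contained sketch. It is 
not the Hadoop code: the class names DefaultPolicy, StaleNodeGroupPolicy and 
FixedNodeGroupPolicy are hypothetical, and chooseTarget is reduced to a toy 
signature. It only shows how adding a parameter to a superclass method turns 
an un-updated subclass "override" into a mere overload, so callers that use 
the new signature end up in the superclass implementation.

```java
public class OverrideDriftSketch {

    /** Hypothetical stand-in for the default policy after the API change. */
    static class DefaultPolicy {
        // New signature: the extra boolean parameter was added by the backport.
        String chooseTarget(int numOfReplicas, boolean avoidStaleNodes) {
            return "default";
        }
    }

    /** Hypothetical subclass that still declares the OLD signature. */
    static class StaleNodeGroupPolicy extends DefaultPolicy {
        // This is now an overload, not an override. Annotating it with
        // @Override would make the compiler reject it, which is a cheap
        // way to catch this kind of drift.
        String chooseTarget(int numOfReplicas) {
            return "nodegroup";
        }
    }

    /** Hypothetical subclass updated to the new signature. */
    static class FixedNodeGroupPolicy extends DefaultPolicy {
        @Override
        String chooseTarget(int numOfReplicas, boolean avoidStaleNodes) {
            return "nodegroup";
        }
    }

    /** Caller that only knows the new superclass API. */
    static String place(DefaultPolicy policy) {
        return policy.chooseTarget(3, true);
    }

    public static void main(String[] args) {
        // Dispatches to the superclass: the subclass logic is silently skipped.
        System.out.println(place(new StaleNodeGroupPolicy()));
        // Dispatches to the subclass, as intended.
        System.out.println(place(new FixedNodeGroupPolicy()));
    }
}
```

Under these assumptions the first call runs the default implementation and 
the second runs the subclass one, which matches the symptom described above.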

The cause for the failure of TestAzureBlockPlacementPolicy is similar.

In addition, TestAzureBlockPlacementPolicy reports an error. Here is the error 
output:

Testcase: testPolicyWithDefaultRacks took 0.005 sec
Caused an ERROR
Invalid network topology. You cannot have a rack and a non-rack node at the 
same level of the network topology.
org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Invalid network 
topology. You cannot have a rack and a non-rack node at the same level of the 
network topology.
at org.apache.hadoop.net.NetworkTopology.add(NetworkTopology.java:396)
at org.apache.hadoop.hdfs.server.namenode.TestAzureBlockPlacementPolicy.testPolicyWithDefaultRacks(TestAzureBlockPlacementPolicy.java:779)

The error is caused by a check in NetworkTopology#add(Node node)
{code}
if (depthOfAllLeaves != node.getLevel()) {
  LOG.error("Error: can't add leaf node at depth " +
      node.getLevel() + " to topology:\n" + oldTopoStr);
  throw new InvalidTopologyException("Invalid network topology. " +
      "You cannot have a rack and a non-rack node at the same " +
      "level of the network topology.");
}
{code}

The problem with this check is that when we use 
NetworkTopology#remove(Node node) to remove a node from the cluster, 
depthOfAllLeaves doesn't change. As a result, we can't reset 
NetworkTopology#depthOfAllLeaves for the old topology of a cluster just by 
removing all of its DataNodes. See 
TestAzureBlockPlacementPolicy#testPolicyWithDefaultRacks():
{code}
// clear the old topology
for (Node node : dataNodes) {
  cluster.remove(node);
}
{code}
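
The behaviour can be reproduced with a toy model. The sketch below is not 
NetworkTopology: ToyTopology, levelOf and canAdd are hypothetical names, 
nodes are plain path strings, and the -1 "no leaf seen yet" sentinel is an 
assumption. It only mirrors the pattern described above, where add() records 
and enforces a leaf depth and remove() never clears it.

```java
import java.util.HashSet;
import java.util.Set;

public class StaleDepthSketch {

    /** Toy stand-in for a topology; NOT the real NetworkTopology. */
    static class ToyTopology {
        private final Set<String> leaves = new HashSet<>();
        // Assumed sentinel: -1 means no leaf has been added yet.
        private int depthOfAllLeaves = -1;

        void add(String path) {
            int level = levelOf(path);
            if (depthOfAllLeaves == -1) {
                depthOfAllLeaves = level;   // first leaf fixes the depth
            } else if (depthOfAllLeaves != level) {
                throw new IllegalStateException(
                    "can't add leaf node at depth " + level
                        + "; expected depth " + depthOfAllLeaves);
            }
            leaves.add(path);
        }

        void remove(String path) {
            // Mirrors the issue: the recorded depth is left untouched,
            // even when the last leaf goes away.
            leaves.remove(path);
        }

        int size() {
            return leaves.size();
        }
    }

    /** "/rack1/host1" has level 2, "/dc1/rack1/host2" has level 3. */
    static int levelOf(String path) {
        return path.split("/").length - 1;
    }

    /** Returns false instead of propagating the depth-check failure. */
    static boolean canAdd(ToyTopology topology, String path) {
        try {
            topology.add(path);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ToyTopology cluster = new ToyTopology();
        cluster.add("/rack1/host1");
        cluster.remove("/rack1/host1");   // "clear the old topology"

        // The topology is empty, but the stale depth still rejects the node.
        System.out.println(canAdd(cluster, "/dc1/rack1/host2"));
        // A fresh instance has no recorded depth and accepts it.
        System.out.println(canAdd(new ToyTopology(), "/dc1/rack1/host2"));
    }
}
```

In this toy model, emptying the topology does not help, while starting from 
a new instance does. Whether the test should build a new topology or 
remove() should reset the depth is a design choice this sketch does not 
settle.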



