dragon created HDFS-10106:
-----------------------------

    Summary: CLONE - Erasure coding: fix priority level of UnderReplicatedBlocks for striped block
    Key: HDFS-10106
    URL: https://issues.apache.org/jira/browse/HDFS-10106
    Project: Hadoop HDFS
    Issue Type: Sub-task
    Reporter: dragon
    Assignee: Walter Su
    Fix For: HDFS-7285

Issue 1: correctly mark corrupted blocks.
Issue 2: distinguish the highest-risk priority from the normal-risk priority.

{code:title=UnderReplicatedBlocks.java}
  private int getPriority(int curReplicas,
  ...
    } else if (curReplicas == 1) {
      //only on replica -risk of loss
      // highest priority
      return QUEUE_HIGHEST_PRIORITY;
  ...
{code}

For striped blocks, we should return QUEUE_HIGHEST_PRIORITY when curReplicas == 6 (assuming a 6+3 schema), because at that point losing one more internal block makes the block group unrecoverable. That's important, because:

{code:title=BlockManager.java}
  DatanodeDescriptor[] chooseSourceDatanodes(BlockInfo block,
  ...
      if(priority != UnderReplicatedBlocks.QUEUE_HIGHEST_PRIORITY
          && !node.isDecommissionInProgress()
          && node.getNumberOfBlocksToBeReplicated() >= maxReplicationStreams) {
        continue; // already reached replication limit
      }
  ...
{code}

chooseSourceDatanodes may then return too few source DNs (maybe 5, when 6 are needed), and recovery fails. A busy node should not be skipped if a block has the highest risk/priority. The issue is that a striped block is never assigned the highest priority.
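
To make the proposal concrete, here is a minimal standalone sketch of how a priority for striped blocks might be computed, and how it interacts with the busy-node check quoted above. It is not the actual patch. The class StripedPrioritySketch, both method signatures and the parameters dataBlockNum and parityBlockNum are hypothetical names introduced for illustration. The queue constants are declared locally and their numeric values are assumptions. Sending a block group with fewer than dataBlockNum live internal blocks to the corrupt queue is one possible reading of Issue 1, not something the description confirms.

```java
// Illustrative sketch only; names and constant values are assumptions,
// not the real UnderReplicatedBlocks / BlockManager code.
final class StripedPrioritySketch {
    static final int QUEUE_HIGHEST_PRIORITY = 0;
    static final int QUEUE_UNDER_REPLICATED = 2;
    static final int QUEUE_WITH_CORRUPT_BLOCKS = 4;

    /**
     * Priority for a striped block group that is already known to be missing
     * at least one internal block.
     *
     * @param curBlocks      live internal blocks in the group
     * @param dataBlockNum   data blocks in the schema (6 for a 6+3 schema)
     * @param parityBlockNum parity blocks in the schema (3 for a 6+3 schema)
     */
    static int getPriority(int curBlocks, int dataBlockNum, int parityBlockNum) {
        if (curBlocks < 0 || curBlocks > dataBlockNum + parityBlockNum) {
            throw new IllegalArgumentException("curBlocks out of range: " + curBlocks);
        }
        if (curBlocks < dataBlockNum) {
            // Too few internal blocks remain to reconstruct the data
            // (assumed handling for Issue 1).
            return QUEUE_WITH_CORRUPT_BLOCKS;
        }
        if (curBlocks == dataBlockNum) {
            // One more loss makes the group unrecoverable: highest risk (Issue 2).
            return QUEUE_HIGHEST_PRIORITY;
        }
        // Some redundancy is left: normal risk.
        return QUEUE_UNDER_REPLICATED;
    }

    /** Mirrors the condition quoted from chooseSourceDatanodes, with plain arguments. */
    static boolean skipBusySource(int priority, boolean decommissionInProgress,
                                  int blocksToBeReplicated, int maxReplicationStreams) {
        return priority != QUEUE_HIGHEST_PRIORITY
            && !decommissionInProgress
            && blocksToBeReplicated >= maxReplicationStreams;
    }

    public static void main(String[] args) {
        int priority = getPriority(6, 6, 3);
        System.out.println("priority with 6 of 9 internal blocks: " + priority);
        System.out.println("skip busy source: " + skipBusySource(priority, false, 10, 2));
    }
}
```

With a 6+3 schema and exactly 6 live internal blocks, the sketch returns the highest priority, so the busy-node check no longer drops a source and all 6 remaining DNs stay available for recovery.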