Chenyu Zheng created HDFS-17515:
-----------------------------------

             Summary: Erasure Coding: ErasureCodingWork is not effectively limited during a block reconstruction cycle.
                 Key: HDFS-17515
                 URL: https://issues.apache.org/jira/browse/HDFS-17515
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Chenyu Zheng
            Assignee: Chenyu Zheng
In a block reconstruction cycle, ErasureCodingWork is not effectively limited. I added a debug log that fires whenever ecBlocksToBeReplicated reaches a multiple of 100:

{code:java}
2024-05-09 10:46:06,986 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManagerZCY: ecBlocksToBeReplicated for IP:PORT already have 100 blocks
2024-05-09 10:46:06,987 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManagerZCY: ecBlocksToBeReplicated for IP:PORT already have 200 blocks
...
2024-05-09 10:46:06,992 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManagerZCY: ecBlocksToBeReplicated for IP:PORT already have 2000 blocks
2024-05-09 10:46:06,992 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManagerZCY: ecBlocksToBeReplicated for IP:PORT already have 2100 blocks
{code}

During a single block reconstruction cycle, ecBlocksToBeReplicated grows from 0 to 2100, which is far larger than replicationStreamsHardLimit. This is unfair and makes the NameNode tend to favor copying EC blocks.

For non-EC blocks this is not a problem: pendingReplicationWithoutTargets is incremented when replication work is scheduled, and once it grows too large, no further work is scheduled for that node. A sketch of this asymmetry follows below.
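To make the asymmetry concrete, here is a minimal sketch in plain Java, not the actual BlockManager code: the class and method names (ReconstructionLimitSketch, DatanodeLoad, canScheduleReplication, canScheduleEcReconstruction) are hypothetical, and only pendingReplicationWithoutTargets, ecBlocksToBeReplicated and replicationStreamsHardLimit correspond to the counters and limit discussed above.

{code:java}
// A minimal sketch, NOT the actual HDFS implementation: it only illustrates the
// per-datanode guard that the replication path applies and that, per this issue,
// the EC reconstruction path is missing.
public class ReconstructionLimitSketch {

    /** Hypothetical per-datanode pending-work counters. */
    static class DatanodeLoad {
        int pendingReplicationWithoutTargets; // bumped when replication work is scheduled
        int ecBlocksToBeReplicated;           // bumped when EC reconstruction work is scheduled
    }

    // Stand-in for replicationStreamsHardLimit
    // (dfs.namenode.replication.max-streams-hard-limit, commonly 4 by default).
    static final int REPLICATION_STREAMS_HARD_LIMIT = 4;

    /** Replication path: a node with too much pending work gets no more work this cycle. */
    static boolean canScheduleReplication(DatanodeLoad node) {
        return node.pendingReplicationWithoutTargets < REPLICATION_STREAMS_HARD_LIMIT;
    }

    /** EC path with the same kind of guard applied, which this issue argues is missing today. */
    static boolean canScheduleEcReconstruction(DatanodeLoad node) {
        return node.ecBlocksToBeReplicated < REPLICATION_STREAMS_HARD_LIMIT;
    }

    public static void main(String[] args) {
        DatanodeLoad node = new DatanodeLoad();
        node.ecBlocksToBeReplicated = 2100; // value observed in the debug log above
        System.out.println("schedule more EC work? " + canScheduleEcReconstruction(node)); // false
    }
}
{code}

With a guard of this shape on the EC side, a datanode whose ecBlocksToBeReplicated already exceeds the hard limit would simply be skipped for further EC reconstruction work in that cycle, mirroring how pendingReplicationWithoutTargets already throttles replication work.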