Chenyu Zheng created HDFS-17515:
-----------------------------------

             Summary: Erasure Coding: ErasureCodingWork is not effectively 
limited during a block reconstruction cycle.
                 Key: HDFS-17515
                 URL: https://issues.apache.org/jira/browse/HDFS-17515
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Chenyu Zheng
            Assignee: Chenyu Zheng


In a block reconstruction cycle, ErasureCodingWork is not effectively limited. 
I added some debug logging that prints a message whenever the size of 
ecBlocksToBeReplicated reaches an integer multiple of 100.

 
{code:java}
2024-05-09 10:46:06,986 DEBUG 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManagerZCY: 
ecBlocksToBeReplicated for IP:PORT already have 100 blocks
2024-05-09 10:46:06,987 DEBUG 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManagerZCY: 
ecBlocksToBeReplicated for IP:PORT already have 200 blocks
...
2024-05-09 10:46:06,992 DEBUG 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManagerZCY: 
ecBlocksToBeReplicated for IP:PORT already have 2000 blocks
2024-05-09 10:46:06,992 DEBUG 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManagerZCY: 
ecBlocksToBeReplicated for IP:PORT already have 2100 blocks {code}
 

During a block reconstruction cycle, ecBlocksToBeReplicated grows from 0 to 
2100, which is far larger than replicationStreamsHardLimit. This is unfair and 
makes the scheduler tend much more strongly toward copying EC blocks.

In fact, this is not a problem for non-EC blocks: 
pendingReplicationWithoutTargets is incremented when work is scheduled, and once 
it becomes too large, no further work is scheduled for that node. EC work has no 
equivalent per-node bound within the cycle, as the sketch below illustrates.
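
To make the asymmetry concrete, here is a minimal, self-contained sketch. It is not the actual BlockManager code; the class and field names are simplified stand-ins for the ones discussed above. The non-EC path checks a per-node counter against the stream limit before scheduling, while the EC path queues work unconditionally:

{code:java}
// Toy model of the scheduling asymmetry described above (not Hadoop code;
// names are simplified stand-ins for the real fields and config values).
class ToyDatanode {
  int pendingReplicationWithoutTargets = 0; // bumped when replication work is scheduled
  int ecBlocksToBeReplicated = 0;           // EC reconstruction work queued on this node
}

class ToyScheduler {
  // dfs.namenode.replication.max-streams defaults to 2
  static final int MAX_REPLICATION_STREAMS = 2;

  // Non-EC path: the per-node counter is checked and incremented while work is
  // being scheduled, so one reconstruction cycle cannot pile work onto one node.
  boolean scheduleReplicationWork(ToyDatanode source) {
    if (source.pendingReplicationWithoutTargets >= MAX_REPLICATION_STREAMS) {
      return false; // node already has enough outstanding work in this cycle
    }
    source.pendingReplicationWithoutTargets++;
    return true;
  }

  // EC path as observed in this issue: nothing bounds the per-node queue
  // within the cycle, so ecBlocksToBeReplicated can reach 2100.
  boolean scheduleErasureCodingWork(ToyDatanode source) {
    source.ecBlocksToBeReplicated++;
    return true;
  }

  public static void main(String[] args) {
    ToyScheduler s = new ToyScheduler();
    ToyDatanode dn = new ToyDatanode();
    int repl = 0, ec = 0;
    for (int i = 0; i < 2100; i++) {
      if (s.scheduleReplicationWork(dn)) repl++;
      if (s.scheduleErasureCodingWork(dn)) ec++;
    }
    // prints: replication scheduled=2, EC scheduled=2100
    System.out.println("replication scheduled=" + repl + ", EC scheduled=" + ec);
  }
}
{code}

For comparison, replicationStreamsHardLimit (dfs.namenode.replication.max-streams-hard-limit) defaults to only 4, which is why 2100 EC blocks queued on a single node in one cycle stands out.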

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
