Hao-Nan Zhu created HDFS-17619:
----------------------------------

Summary: Use ConcurrentHashMap to avoid synchronized block
Key: HDFS-17619
URL: https://issues.apache.org/jira/browse/HDFS-17619
Project: Hadoop HDFS
Issue Type: Improvement
Components: server
Affects Versions: 3.3.6, 3.3.0
Reporter: Hao-Nan Zhu

Hi,

I’ve encountered a performance bottleneck in _PendingReconstructionBlocks_ that may be worth optimizing.

The [decrement|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/PendingReconstructionBlocks.java#L112] method in _hdfs.server.blockmanagement.PendingReconstructionBlocks_ contains a synchronized block that locks _pendingReconstructions_. Inside this synchronized block, it calls the [decrementReplicas|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/PendingReconstructionBlocks.java#L237] function, which contains a loop that iterates over all the datanodes. This loop could take a long time if the number of datanodes is large, and because the lock is held for its whole duration, it can lead to lock contention on the _pendingReconstructions_ object.

To mitigate this while maintaining thread safety, _pendingReconstructions_ could be changed to a _ConcurrentHashMap_, with access to _target_ made thread safe as well.

A similar issue is also observed in [pendingReconstructionCheck|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/PendingReconstructionBlocks.java#L277], which can be addressed with the same strategy.

I’m looking into creating a patch for this, but before that, I would like to know whether it is worth optimizing. Also, please let me know if there is something wrong with my understanding or analysis.

Thanks!
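
To make the proposal concrete, here is a minimal sketch of what a _ConcurrentHashMap_-based decrement might look like. It is not the actual Hadoop code and not a proposed patch: the class _PendingReconstructionSketch_ is hypothetical, and blocks and datanode targets are represented as plain strings instead of _BlockInfo_ and _DatanodeStorageInfo_. The idea is that _computeIfPresent_ performs the lookup, the per-block update, and the conditional removal atomically for one key, so there is no synchronized block over the whole map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for PendingReconstructionBlocks.
// Assumption: blocks and targets are Strings; the real class uses
// BlockInfo keys and a PendingBlockInfo value holding the targets.
class PendingReconstructionSketch {
  private final ConcurrentHashMap<String, List<String>> pending =
      new ConcurrentHashMap<>();

  // Record that a reconstruction of 'block' was scheduled on 'targets'.
  void increment(String block, String... targets) {
    pending.compute(block, (key, found) -> {
      List<String> list = (found == null) ? new ArrayList<>() : found;
      list.addAll(Arrays.asList(targets));
      return list;
    });
  }

  // Remove one target from the pending entry of 'block'.
  // Returns true only if the entry existed and was removed because it
  // became empty.
  boolean decrement(String block, String target) {
    boolean[] removed = {false};
    // The remapping function runs atomically with respect to other
    // updates of the same key; returning null removes the mapping.
    pending.computeIfPresent(block, (key, found) -> {
      found.remove(target);
      if (found.isEmpty()) {
        removed[0] = true;
        return null;
      }
      return found;
    });
    return removed[0];
  }

  // Number of blocks that still have pending reconstructions.
  int size() {
    return pending.size();
  }
}
```

In this sketch the target list is only mutated inside _compute_ and _computeIfPresent_, which is what keeps the plain _ArrayList_ safe; any code that reads the list outside those calls would need its own protection. The loop over targets still runs while the map holds an internal lock for that key's bin, so the cost of the loop does not go away. What changes is that updates to unrelated blocks are much less likely to wait on it.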