Hao-Nan Zhu created HDFS-17619:
----------------------------------

             Summary: Use ConcurrentHashMap to avoid synchronized block
                 Key: HDFS-17619
                 URL: https://issues.apache.org/jira/browse/HDFS-17619
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: server
    Affects Versions: 3.3.6, 3.3.0
            Reporter: Hao-Nan Zhu


Hi, I’ve encountered performance bottlenecks in _PendingReconstructionBlocks_ 
that could be optimized.

 

The 
[decrement|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/PendingReconstructionBlocks.java#L112]
 method in _hdfs.server.blockmanagement.PendingReconstructionBlocks_ contains a 
synchronized block that locks _pendingReconstructions_. Within this 
synchronized block, it calls the 
[decrementReplicas|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/PendingReconstructionBlocks.java#L237]
 function, which contains a loop that iterates over all the datanodes. This 
could take a long time if the number of datanodes is large, and it can 
lead to lock contention on the _pendingReconstructions_ object. 
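
To make the pattern concrete, here is a simplified sketch of the locking 
shape described above. It is not the Hadoop source: the class 
CoarseLockSketch, the String block and datanode keys, and the method bodies 
are hypothetical stand-ins chosen only to show a linear scan running while a 
map-wide lock is held.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the pattern; names and types are assumptions.
public class CoarseLockSketch {
  // Stand-in for the per-block record that tracks pending targets.
  static final class PendingInfo {
    final List<String> targets = new ArrayList<>();
  }

  // A plain HashMap, so every access must hold the same monitor.
  private final Map<String, PendingInfo> pending = new HashMap<>();

  public void increment(String block, String... datanodes) {
    synchronized (pending) {
      PendingInfo info = pending.computeIfAbsent(block, k -> new PendingInfo());
      for (String dn : datanodes) {
        info.targets.add(dn);
      }
    }
  }

  public boolean decrement(String block, String datanode) {
    synchronized (pending) {
      PendingInfo info = pending.get(block);
      if (info == null) {
        return false;
      }
      // List.remove(Object) is a linear scan, and it runs while the
      // map-wide lock is held, so work on unrelated blocks waits too.
      boolean removed = info.targets.remove(datanode);
      if (info.targets.isEmpty()) {
        pending.remove(block);
      }
      return removed;
    }
  }

  public int size() {
    synchronized (pending) {
      return pending.size();
    }
  }
}
```

In this shape, the cost of the scan is paid by every thread that needs the 
map, not only by threads working on the same block.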

 

To mitigate this while maintaining thread safety, one option is to use a 
_ConcurrentHashMap_ for _pendingReconstructions_ and ensure that access 
to _target_ is thread safe as well. A similar issue is also observed at 
[pendingReconstructionCheck|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/PendingReconstructionBlocks.java#L277],
 which can be addressed with the same strategy.
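
As a rough illustration of the idea, and not a proposed patch, the sketch 
below shows one way the map could be made concurrent. The class 
ConcurrentPendingSketch, the String keys, and the PendingInfo record are 
hypothetical. It assumes that every read and write of the target list goes 
through compute or computeIfPresent, which ConcurrentHashMap runs atomically 
per key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; names and types are assumptions, not Hadoop code.
public class ConcurrentPendingSketch {
  static final class PendingInfo {
    // Only touched inside compute/computeIfPresent for its own key.
    final List<String> targets = new ArrayList<>();
  }

  private final ConcurrentHashMap<String, PendingInfo> pending =
      new ConcurrentHashMap<>();

  public void increment(String block, String... datanodes) {
    // compute is atomic for this key; other keys are not blocked by a
    // single map-wide monitor.
    pending.compute(block, (k, info) -> {
      if (info == null) {
        info = new PendingInfo();
      }
      for (String dn : datanodes) {
        info.targets.add(dn);
      }
      return info;
    });
  }

  public boolean decrement(String block, String datanode) {
    boolean[] removed = {false};
    // Returning null from the remapping function removes the entry.
    pending.computeIfPresent(block, (k, info) -> {
      removed[0] = info.targets.remove(datanode);
      return info.targets.isEmpty() ? null : info;
    });
    return removed[0];
  }

  public int size() {
    return pending.size();
  }
}
```

The scan over the targets still runs under the map's internal per-bin lock, 
so this narrows contention to keys that share a bin rather than removing it. 
The ConcurrentHashMap documentation also asks that remapping functions stay 
short and not update other mappings of the same map, which would need to be 
checked against the real code paths.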

 

I’m looking into creating a patch for this, but before that, I’d like to know 
whether it is worth optimizing. Also, please let me know if there is something 
wrong with my understanding or analysis. Thanks!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
