Hao-Nan Zhu created HDFS-17627:
----------------------------------

             Summary: Performance optimization on BlockUnderConstructionFeature
                 Key: HDFS-17627
                 URL: https://issues.apache.org/jira/browse/HDFS-17627
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: server
    Affects Versions: 3.3.0
            Reporter: Hao-Nan Zhu

Hi,

I’ve encountered performance bottlenecks in _blockmanagement.BlockUnderConstructionFeature_ and I wonder if there is room for optimization.

[_getStaleReplica()_|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockUnderConstructionFeature.java#L219] may degrade performance when the list of replicas is large. The method collects stale replicas in an *ArrayList* that is not pre-sized, which could cause repeated memory reallocations and, potentially, OOM errors as the number of stale replicas increases.

Furthermore, [_getStaleReplica()_|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockUnderConstructionFeature.java#L219] could also cause lock contention on code paths such as:

[_updatePipelineInternal()_|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java#L6054] (holding the global lock) -> [_updateLastBlock()_|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L4965] -> [_setGenerationStampAndVerifyReplicas()_|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java#L426] -> [_getStaleReplica()_|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockUnderConstructionFeature.java#L219].

One optimization would be to pre-size the ArrayList based on the actual number of replicas (i.e. _List<ReplicaUnderConstruction> staleReplicas = new ArrayList<>(replicas.length)_), which would minimize resizing and reallocation. Another option is to maintain a persistent list of _staleReplicas_, so that there is no need to iterate over the replicas on each call.

The same issue could also occur in [_appendUCPartsConcise()_|https://github.com/apache/hadoop/blob/6be04633b55bbd67c2875e39977cd9d2308dc1d1/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockUnderConstructionFeature.java#L349]. It takes a StringBuilder with a [default size of 150|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirWriteFileOp.java#L792] characters, which risks repeated resizing when the number of replicas is large.

Within _BlockUnderConstructionFeature_, other methods have similar issues, including [_addReplicaIfNotPresent()_|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockUnderConstructionFeature.java#L294] and [_setExpectedLocations()_|https://github.com/apache/hadoop/blob/2f0dd7c4feb1e482d47786d26d6d32483f39414b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockUnderConstructionFeature.java#L74].

Please let me know if there is anything wrong with the analysis above, or if you have any comments on the optimization.

Thanks!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
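
For illustration, the pre-sizing idea from the issue description could look roughly like the sketch below. It is not the Hadoop code: _StaleReplicaSketch_, _Replica_, _describe()_ and _CHARS_PER_REPLICA_ are hypothetical stand-ins invented for this example, and the per-replica character estimate is an assumption rather than a measured value.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; these are NOT the actual Hadoop classes. */
final class StaleReplicaSketch {

  /** Hypothetical stand-in for ReplicaUnderConstruction. */
  static final class Replica {
    private final long generationStamp;

    Replica(long generationStamp) {
      this.generationStamp = generationStamp;
    }

    long getGenerationStamp() {
      return generationStamp;
    }
  }

  /** Assumed per-replica size estimate for the description string. */
  static final int CHARS_PER_REPLICA = 32;

  /**
   * Collects replicas whose generation stamp differs from genStamp.
   * The list is pre-sized to replicas.length, the upper bound on the
   * number of stale replicas, so add() never has to grow the backing array.
   */
  static List<Replica> staleReplicas(Replica[] replicas, long genStamp) {
    if (replicas == null) {
      return new ArrayList<>(0);
    }
    List<Replica> stale = new ArrayList<>(replicas.length);
    for (Replica r : replicas) {
      if (r.getGenerationStamp() != genStamp) {
        stale.add(r);
      }
    }
    return stale;
  }

  /**
   * Builds a short description using a StringBuilder whose initial
   * capacity is estimated from the replica count instead of a fixed size.
   */
  static String describe(Replica[] replicas) {
    StringBuilder sb =
        new StringBuilder(16 + replicas.length * CHARS_PER_REPLICA);
    sb.append("replicas=[");
    for (int i = 0; i < replicas.length; i++) {
      if (i > 0) {
        sb.append(", ");
      }
      sb.append("gs=").append(replicas[i].getGenerationStamp());
    }
    return sb.append(']').toString();
  }

  public static void main(String[] args) {
    Replica[] replicas = {
        new Replica(1001L), new Replica(1002L), new Replica(1002L)
    };
    System.out.println(staleReplicas(replicas, 1002L).size()
        + " stale of " + describe(replicas));
  }
}
```

One trade-off to keep in mind: pre-sizing to _replicas.length_ reserves room for every replica even when only a few are stale, so whether it is a net win depends on the typical ratio of stale to total replicas.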