[jira] [Updated] (HDDS-13639) optimize container iterator for frequent operation

Sumit Agrawal (Jira) Thu, 04 Sep 2025 10:05:44 -0700


     [ 
https://issues.apache.org/jira/browse/HDDS-13639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sumit Agrawal updated HDDS-13639:
---------------------------------
    Description: 
In one environment with 900K containers on a datanode, looping over the 
container list *1k times* took *2.5 minutes*, because the number of pipelines 
in error was 1k.

Frequently looping over a large number of containers for filtering can 
therefore take a long time.

 

 

The following cases need to be checked:
1. This iteration could be avoided, since it is used only for metrics; 
otherwise, the code that loops over the containers for each volume should be 
optimized.

ContainerController.getContainerCount(HddsVolume) (org.apache.hadoop.ozone.container.ozoneimpl)
-- HddsVolume.getContainers() (org.apache.hadoop.ozone.container.common.volume)
---- VolumeInfoMetrics.getContainers() (org.apache.hadoop.ozone.container.common.volume)
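
One way the per-volume loop could be avoided for metrics is to maintain a counter per volume that is updated when containers are added or removed, so that reading the count does not scan all containers. The following is a minimal sketch only: the class VolumeContainerCounter, its methods, and the use of a String volume ID are hypothetical and are not taken from the Ozone code base.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/**
 * Hypothetical helper (not part of Ozone): keeps a running container count
 * per volume so a metrics read is a map lookup instead of a full scan.
 */
class VolumeContainerCounter {
  // Assumption: a volume is identified by a String ID.
  private final Map<String, LongAdder> countByVolume = new ConcurrentHashMap<>();

  /** Call when a container is added to the given volume. */
  void onContainerAdded(String volumeId) {
    countByVolume.computeIfAbsent(volumeId, k -> new LongAdder()).increment();
  }

  /** Call when a container is removed from the given volume. */
  void onContainerRemoved(String volumeId) {
    LongAdder adder = countByVolume.get(volumeId);
    if (adder != null) {
      adder.decrement();
    }
  }

  /** Returns the tracked count without iterating over any containers. */
  long getContainerCount(String volumeId) {
    LongAdder adder = countByVolume.get(volumeId);
    return adder == null ? 0L : adder.sum();
  }
}
```

The trade-off in this sketch is that every add and remove path must update the counter, in exchange for a constant-time read on each metrics poll.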
 

  was:
In one environment with 900K containers on a datanode, looping over the 
container list takes {*}2.5 minutes{*}, which is very slow.

 

The following cases need to be checked:
1. This iteration could be avoided, since it is used only for metrics; 
otherwise, the code that loops over the containers for each volume should be 
optimized.

ContainerController.getContainerCount(HddsVolume) 
(org.apache.hadoop.ozone.container.ozoneimpl)
HddsVolume.getContainers() (org.apache.hadoop.ozone.container.common.volume)
VolumeInfoMetrics.getContainers() 
(org.apache.hadoop.ozone.container.common.volume)
 
2. The copy of containerSet can be avoided by using the map iterator directly.
 
org.apache.hadoop.ozone.container.common.impl.ContainerSet#getContainerReport
 
This runs frequently: the interval at which Prometheus polls the metric is 
generally 30 sec.
 
{code:java}
List<Container<?>> containers = new ArrayList<>(containerMap.values());
{code}
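
As an illustration of the second case, the sketch below iterates over the map's values() view directly instead of first copying it into an ArrayList. It assumes the map is a concurrent map such as ConcurrentSkipListMap, whose iterators are weakly consistent and do not throw ConcurrentModificationException. ContainerScanSketch and ContainerStub are hypothetical stand-ins, not Ozone classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.function.Predicate;

/** Hypothetical sketch (not Ozone code): filter containers without copying. */
class ContainerScanSketch {
  /** Minimal stand-in for a container: an ID and the volume it lives on. */
  static final class ContainerStub {
    final long id;
    final String volumeId;

    ContainerStub(long id, String volumeId) {
      this.id = id;
      this.volumeId = volumeId;
    }
  }

  // Assumption: the backing map is a concurrent, sorted map keyed by ID.
  private final Map<Long, ContainerStub> containerMap =
      new ConcurrentSkipListMap<>();

  void add(ContainerStub container) {
    containerMap.put(container.id, container);
  }

  /** Counts matching containers by walking the live values() view. */
  long countMatching(Predicate<ContainerStub> filter) {
    long matches = 0;
    // No intermediate ArrayList is allocated; the iterator is weakly consistent.
    for (ContainerStub container : containerMap.values()) {
      if (filter.test(container)) {
        matches++;
      }
    }
    return matches;
  }
}
```

Because the iterator is weakly consistent, a scan may or may not reflect containers added or removed while it runs, which may be acceptable for a periodic report but is a design choice to confirm.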
 


> optimize container iterator for frequent operation
> --------------------------------------------------
>
>                 Key: HDDS-13639
>                 URL: https://issues.apache.org/jira/browse/HDDS-13639
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Sumit Agrawal
>            Priority: Major
>
> In one environment with 900K containers on a datanode, looping over the 
> container list *1k times* took *2.5 minutes*, because the number of 
> pipelines in error was 1k.
> Frequently looping over a large number of containers for filtering can 
> therefore take a long time.
>  
>  
> The following cases need to be checked:
> 1. This iteration could be avoided, since it is used only for metrics; 
> otherwise, the code that loops over the containers for each volume should 
> be optimized.
> ContainerController.getContainerCount(HddsVolume) 
> (org.apache.hadoop.ozone.container.ozoneimpl)
> -- HddsVolume.getContainers() 
> (org.apache.hadoop.ozone.container.common.volume)
> ---- VolumeInfoMetrics.getContainers() 
> (org.apache.hadoop.ozone.container.common.volume)
>  



