Re: [PR] HDDS-7543. ContainerStateMap uses HashSet to store container replica lists [ozone]

via GitHub Mon, 01 Sep 2025 04:03:46 -0700


sodonnel commented on PR #8990:
URL: https://github.com/apache/ozone/pull/8990#issuecomment-3241942511


   The original Jira states that the set can hold 3 replicas max, which is not 
true. In the case of EC 10-4 (10 data + 4 parity), it will hold 14 replicas. If 
there is over-replication, it could hold even a few more than that.
   
   This change may be fine, but my initial thought is that we are going from a 
hash lookup to a list iteration when checking for the existence of a replica on 
addition or removal (container report processing, or dead node handling), and 
therefore the runtime complexity is higher. Could this slow down container 
report processing?
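
   To make the comparison concrete, here is a minimal sketch of the two 
lookups. It uses plain `String` IDs as a stand-in for the real replica type, and 
the class and helper names are hypothetical, not taken from the Ozone code:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReplicaLookup {
  // Hypothetical helper: builds n placeholder replica IDs "dn0", "dn1", ...
  static List<String> replicaIds(int n) {
    List<String> ids = new ArrayList<>(n);
    for (int i = 0; i < n; i++) {
      ids.add("dn" + i);
    }
    return ids;
  }

  public static void main(String[] args) {
    List<String> asList = replicaIds(14);        // e.g. EC 10-4
    Set<String> asSet = new HashSet<>(asList);

    // List.contains walks the elements and calls equals() on each: O(n).
    System.out.println(asList.contains("dn13"));
    // HashSet.contains hashes the key and probes one bucket: O(1) expected.
    System.out.println(asSet.contains("dn13"));
  }
}
```

   With at most a dozen or so entries per container the scan is short, so 
whether the difference matters would need to be measured on the container 
report path.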
   
   Do we have any option to reduce the HashSet's initial capacity to 6 or 8 
to waste less memory?
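
   For reference, `java.util.HashSet` takes an initial capacity in its 
constructor (the default is 16 with a load factor of 0.75). A minimal sketch, 
assuming `String` IDs in place of the real replica type and a hypothetical 
helper name:

```java
import java.util.HashSet;
import java.util.Set;

public class ReplicaSetSizing {
  // Hypothetical helper, not part of Ozone: creates a replica set with a
  // caller-chosen initial capacity instead of the default of 16.
  static <T> Set<T> newReplicaSet(int initialCapacity) {
    return new HashSet<>(initialCapacity);
  }

  public static void main(String[] args) {
    // Sized for the common 3-replica case.
    Set<String> replicas = newReplicaSet(8);

    // The set still grows on demand, so an EC 10-4 container with
    // 14 replicas fits; it just pays for resizing along the way.
    for (int i = 0; i < 14; i++) {
      replicas.add("dn" + i);
    }
    System.out.println(replicas.size());
  }
}
```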
   
   I also haven't checked where this code is called, but it's unfortunate that a 
new HashSet needs to be constructed each time the replicas are retrieved. I 
understand why this is the case, as the list could otherwise be modified by 
container reporting or dead node handling after the copy has been retrieved. 
However, it would be relatively rare for the replicas for a container to change 
after the cluster has reached a steady state. Perhaps we can make the entries an 
unmodifiable set / list that can be returned to the caller directly, and then 
if the set / list needs to be changed, a new unmodifiable copy is created with 
the change to replace the original.
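
   A rough sketch of that copy-on-write idea, assuming `String` IDs in place of 
the real replica type. The class and method names here are hypothetical and are 
not the existing ContainerStateMap API:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class CopyOnWriteReplicas {
  // Always points at an unmodifiable snapshot; volatile so readers see
  // the latest snapshot without taking a lock.
  private volatile Set<String> replicas = Collections.emptySet();

  // Reads return the current snapshot directly, with no copy.
  public Set<String> getReplicas() {
    return replicas;
  }

  // Writes copy, modify the copy, then swap in a new unmodifiable snapshot.
  public synchronized void addReplica(String replica) {
    Set<String> copy = new HashSet<>(replicas);
    if (copy.add(replica)) {
      replicas = Collections.unmodifiableSet(copy);
    }
  }

  public synchronized void removeReplica(String replica) {
    Set<String> copy = new HashSet<>(replicas);
    if (copy.remove(replica)) {
      replicas = Collections.unmodifiableSet(copy);
    }
  }
}
```

   A caller holding an earlier snapshot never sees it change, and the copy 
cost moves from every read to the comparatively rare write.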


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

