sodonnel commented on PR #8990: URL: https://github.com/apache/ozone/pull/8990#issuecomment-3241942511
The original Jira states that the set can hold at most 3 replicas, which is not true. In the case of EC 10-4, it will hold 14 replicas, and if there is over-replication it could hold a few more than that.

This change may be fine, but my initial thought is that we are going from a hash lookup to a list iteration when checking for the existence of a replica on addition or removal (container report processing, or dead node handling), and therefore the runtime complexity is higher. Could this slow down container report processing? Do we have any option to reduce the initial HashSet capacity to 6 or 8 to waste less memory?

I also haven't checked where this code is called, but it's unfortunate that a new HashSet needs to be constructed each time the replicas are retrieved. I understand why this is the case, as the list could otherwise be modified by container reporting or dead node handling after the copy has been retrieved. However, it would be relatively rare for the replicas for a container to change after the cluster has reached a steady state. Perhaps we can make the entries an unmodifiable set / list that can be returned to the caller directly, and then if the set / list needs to be changed, a new unmodifiable copy is created with the change to replace the original.

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
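
To make the last suggestion in the comment concrete, here is a minimal sketch of the copy-on-write idea. It is only an illustration: `ReplicaSnapshotHolder` and its methods are hypothetical names, not the actual Ozone classes, and `String` stands in for whatever replica type the real code uses.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical holder for illustration only; not the real Ozone class.
// String is a stand-in for the actual replica type.
final class ReplicaSnapshotHolder {

  // volatile so readers always see the most recently published snapshot.
  private volatile Set<String> replicas = Collections.emptySet();

  // Readers get the current unmodifiable snapshot directly, with no copy.
  Set<String> getReplicas() {
    return replicas;
  }

  // Writers copy the current set, apply the change, and publish a new
  // unmodifiable set. Snapshots handed out earlier are never mutated.
  synchronized boolean addReplica(String replica) {
    if (replicas.contains(replica)) {
      return false; // nothing to change, keep the existing snapshot
    }
    Set<String> copy = new HashSet<>(replicas);
    copy.add(replica);
    replicas = Collections.unmodifiableSet(copy);
    return true;
  }

  synchronized boolean removeReplica(String replica) {
    if (!replicas.contains(replica)) {
      return false; // nothing to change, keep the existing snapshot
    }
    Set<String> copy = new HashSet<>(replicas);
    copy.remove(replica);
    replicas = Collections.unmodifiableSet(copy);
    return true;
  }
}
```

With this shape, the copy cost moves from every read to the (assumed rare) writes, and existence checks stay hash lookups. The sketch does not address the memory footprint question raised above; that would still depend on how the backing collection is sized.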
