[ https://issues.apache.org/jira/browse/HDDS-12595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17939033#comment-17939033 ]
Ethan Rose commented on HDDS-12595:
-----------------------------------

Adding some ideas for the implementation:
* Users would need to authenticate as the SCM admin for this command to work
* We can get the SCM container state from the [getContainer|https://github.com/apache/ozone/blob/2b48e8c6ec1739d541d5c02183ad1a91d9f7a308/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/client/ScmClient.java#L71] API in SCM.
* We could use the [getContainerReplicas|https://github.com/apache/ozone/blob/2b48e8c6ec1739d541d5c02183ad1a91d9f7a308/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/client/ScmClient.java#L89] SCM API for replica info, but it is probably better to get this information from the datanodes since that is how the other replica verification checks work.
* To get the information from the datanodes we can use the [readContainer API|https://github.com/apache/ozone/blob/b91292576e6a13c3c0149aca06847397b6ccc5a6/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/storage/ContainerProtocolCalls.java#L667] on the datanodes.
** We can use [ContainerOperationClient#readContainer|https://github.com/apache/ozone/blob/2b48e8c6ec1739d541d5c02183ad1a91d9f7a308/hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/ContainerOperationClient.java#L374] as a wrapper, which will fetch the container token automatically, but we may want to cache the tokens to avoid calling SCM every time, so the lower level version in {{ContainerProtocolCalls}} might be preferred.


> Add verifier for container states
> ---------------------------------
>
>                 Key: HDDS-12595
>                 URL: https://issues.apache.org/jira/browse/HDDS-12595
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Ethan Rose
>            Assignee: Tejaskriya
>            Priority: Major
>
> Add a verifier that checks the container state in SCM for all containers in a
> replica.
> This can detect keys which are mapped to a container in a few bad
> states:
> * DELETING
> * DELETED
> * All replicas UNHEALTHY
> * All replicas missing



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org
For additional commands, e-mail: issues-h...@ozone.apache.org
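
A minimal sketch of the check described in the issue, assuming the verifier already holds the container state reported by SCM and the replica states read back from the datanodes. The class, enums, and method names below are hypothetical stand-ins for illustration only. They are simplified and are not the actual Ozone protobuf types or the ScmClient / ContainerProtocolCalls APIs linked in the comment.

```java
import java.util.List;

// Hypothetical, self-contained sketch; none of these types exist in Ozone.
final class ContainerStateSketch {

    // Simplified stand-in for the container state tracked by SCM.
    enum ScmState { OPEN, CLOSING, QUASI_CLOSED, CLOSED, DELETING, DELETED }

    // Simplified stand-in for the state a datanode reports for one replica.
    enum ReplicaState { OPEN, CLOSING, QUASI_CLOSED, CLOSED, UNHEALTHY }

    // The four bad states named in the issue description, plus OK.
    enum Verdict { OK, DELETING, DELETED, ALL_REPLICAS_UNHEALTHY, ALL_REPLICAS_MISSING }

    // Classifies one container. An empty replica list is treated as
    // "all replicas missing" (an assumption of this sketch).
    static Verdict classify(ScmState scmState, List<ReplicaState> replicas) {
        if (scmState == ScmState.DELETING) {
            return Verdict.DELETING;
        }
        if (scmState == ScmState.DELETED) {
            return Verdict.DELETED;
        }
        if (replicas.isEmpty()) {
            return Verdict.ALL_REPLICAS_MISSING;
        }
        boolean allUnhealthy =
            replicas.stream().allMatch(r -> r == ReplicaState.UNHEALTHY);
        return allUnhealthy ? Verdict.ALL_REPLICAS_UNHEALTHY : Verdict.OK;
    }

    public static void main(String[] args) {
        // One healthy replica is enough for the container to pass.
        System.out.println(classify(ScmState.CLOSED,
            List.of(ReplicaState.CLOSED, ReplicaState.UNHEALTHY)));
        // No replicas reported at all.
        System.out.println(classify(ScmState.CLOSED, List.of()));
    }
}
```

In this sketch the SCM state is checked first, so a container that SCM marks DELETING or DELETED is reported as such regardless of what the datanodes return. Whether the real verifier should order the checks this way is an open design choice, not something the issue specifies.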