[ https://issues.apache.org/jira/browse/HDDS-12109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929314#comment-17929314 ]
Ivan Andika commented on HDDS-12109:
------------------------------------

[~peterxcli] AFAIK, this is not possible currently. We might need to implement a new protocol in InterSCMProtocol.proto if one SCM wants to query another SCM's node ID (e.g. scm1 instead of the UUID).

> Transfer leadership should not start until target SCM is out of safe mode
> -------------------------------------------------------------------------
>
>                 Key: HDDS-12109
>                 URL: https://issues.apache.org/jira/browse/HDDS-12109
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Ivan Andika
>            Assignee: Peter Lee
>            Priority: Major
>
> We encountered an incident where an administrator restarted an SCM and
> transferred leadership to it immediately, while it was still in safe mode.
> The leadership was transferred to the SCM in safe mode.
> However, the new leader could not serve any requests, causing user write
> requests to block until the new leader SCM was out of safe mode.
> We can add a mechanism to prevent a leadership transfer if the target SCM is
> still in safe mode.
> This can be implemented on the Ozone side or the Ratis side. For Ratis, one
> possible idea is to add another StateMachine API that checks whether a
> follower is ready for a leader transfer. However, I think adding a simple
> check of scmClient#inSafeMode should suffice, but we need to change it so
> that scmClient#inSafeMode is not directed to the leader.
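
For illustration only, the safe-mode check proposed in the description could be sketched roughly as below. This is not Ozone or Ratis code: LeaderTransferGuard, its constructor argument, and transferLeadership are hypothetical names, and the per-node probe stands in for a version of scmClient#inSafeMode that queries the target SCM directly instead of being routed to the leader.

```java
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical guard; none of these names exist in Ozone or Ratis. */
public class LeaderTransferGuard {
    // Assumed probe: returns true while the SCM with the given node ID is in
    // safe mode. It must ask that SCM itself, not the current leader.
    private final Predicate<String> inSafeMode;

    public LeaderTransferGuard(Predicate<String> inSafeMode) {
        this.inSafeMode = inSafeMode;
    }

    /** Runs doTransfer only if the target SCM has left safe mode. */
    public void transferLeadership(String targetNodeId, Runnable doTransfer) {
        if (inSafeMode.test(targetNodeId)) {
            throw new IllegalStateException(
                "SCM " + targetNodeId
                    + " is still in safe mode; refusing leadership transfer");
        }
        doTransfer.run();
    }

    public static void main(String[] args) {
        // Stand-in cluster state: scm1 was just restarted, scm2 is ready.
        Map<String, Boolean> safeMode = Map.of("scm1", true, "scm2", false);
        // Unknown nodes are treated as still being in safe mode.
        LeaderTransferGuard guard =
            new LeaderTransferGuard(id -> safeMode.getOrDefault(id, true));

        guard.transferLeadership("scm2",
            () -> System.out.println("transferred leadership to scm2"));
        try {
            guard.transferLeadership("scm1",
                () -> System.out.println("transferred leadership to scm1"));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Rejecting the request up front keeps the current leader serving writes; whether the check belongs in the admin CLI, in SCM, or behind a new Ratis StateMachine API is the open question in the issue.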