[ 
https://issues.apache.org/jira/browse/HDDS-12109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929314#comment-17929314
 ] 

Ivan Andika commented on HDDS-12109:
------------------------------------

[~peterxcli] AFAIK, this is not currently possible. We might need to add a new 
protocol to InterSCMProtocol.proto if one SCM needs to query another SCM's node 
ID (e.g. scm1 instead of the UUID).

> Transfer leadership should not start until target SCM is out of safe mode
> -------------------------------------------------------------------------
>
>                 Key: HDDS-12109
>                 URL: https://issues.apache.org/jira/browse/HDDS-12109
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Ivan Andika
>            Assignee: Peter Lee
>            Priority: Major
>
> We encountered an incident where an administrator restarted an SCM and
> immediately transferred leadership to it while it was still in safe mode.
> The leadership was transferred to the SCM in safe mode.
> However, the new leader could not serve any requests, so user write requests
> blocked until the new leader SCM was out of safe mode.
> We can add a mechanism that prevents a leadership transfer if the target SCM
> is still in safe mode.
> This can be implemented on the Ozone side or the Ratis side. For Ratis, one
> possible idea is to add another StateMachine API that checks whether a
> follower is ready for a leader transfer. However, I think a simple check of
> scmClient#inSafeMode should suffice, provided we change it so that
> scmClient#inSafeMode is not directed to the leader.
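
A minimal sketch of the simpler option described above: check the target SCM's
safe mode status before starting the transfer. The names SafeModeProbe and
TransferLeadershipGuard are hypothetical and are not part of Ozone or Ratis.
The sketch assumes the probe can query the target SCM directly instead of
being directed to the current leader, which is the change the description
says is still needed.

```java
// Hypothetical stand-in for a per-SCM safe mode query; NOT the real Ozone API.
// Assumption: the call reaches the named SCM itself, not the current leader.
interface SafeModeProbe {
  boolean inSafeMode(String scmNodeId);
}

// Hypothetical guard that refuses a leadership transfer to an SCM in safe mode.
final class TransferLeadershipGuard {
  private final SafeModeProbe probe;

  TransferLeadershipGuard(SafeModeProbe probe) {
    this.probe = probe;
  }

  // Returns true only when the target SCM reports that it has left safe mode.
  boolean canTransferTo(String targetNodeId) {
    return !probe.inSafeMode(targetNodeId);
  }

  // Runs the supplied transfer action only if the target is out of safe mode;
  // otherwise fails fast so the current leader keeps serving requests.
  void transferLeadership(String targetNodeId, Runnable doTransfer) {
    if (!canTransferTo(targetNodeId)) {
      throw new IllegalStateException(
          "Target SCM " + targetNodeId
              + " is still in safe mode; refusing to transfer leadership");
    }
    doTransfer.run();
  }
}
```

With this shape, the admin command would fail immediately with a clear error
instead of handing leadership to an SCM that cannot yet serve requests.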



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org
For additional commands, e-mail: issues-h...@ozone.apache.org
