[ 
https://issues.apache.org/jira/browse/IGNITE-24069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Pochatkin reassigned IGNITE-24069:
------------------------------------------

    Assignee: Vadim Kolodin

> Turn the pending assignments into a queue
> -----------------------------------------
>
>                 Key: IGNITE-24069
>                 URL: https://issues.apache.org/jira/browse/IGNITE-24069
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Denis Chudov
>            Assignee: Vadim Kolodin
>            Priority: Major
>              Labels: ignite-3
>
> *Motivation*
> In Raft, a configuration switch goes through joint consensus, in which the
> nodes from both the old and the new configuration participate with their
> corresponding roles. Therefore, a node that is a learner in the previous
> configuration cannot simply be included in the new configuration as a
> follower. Joint consensus requires that the node first be removed as a
> learner and only then included in the next configuration as a peer, so two
> configuration switches are needed. Downgrading a peer to a learner should
> work the same way.
> The handlers of the pending and stable assignments’ switch should be aware
> of changes in which some node (say, node A) is turned from a learner into a
> peer or, vice versa, from a peer into a learner. Either an upgrade or a
> downgrade requires two consecutive configuration switches: for an upgrade,
> the first switch removes node A as a learner and the second adds it as a
> peer; a downgrade mirrors this.
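
A minimal sketch of how such a two-step plan might be computed, assuming node names are plain strings. `RaftConfigSketch` and `RoleChangePlanner` are hypothetical names introduced for illustration and are not part of Ignite; the choice to leave every other change for the final step is also an assumption, since the ticket does not specify it.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of one Raft configuration: voting peers plus learners. */
final class RaftConfigSketch {
    final Set<String> peers;
    final Set<String> learners;

    RaftConfigSketch(Set<String> peers, Set<String> learners) {
        this.peers = Collections.unmodifiableSet(new HashSet<>(peers));
        this.learners = Collections.unmodifiableSet(new HashSet<>(learners));
    }
}

/** Hypothetical planner: splits a peer/learner role change into two switches. */
final class RoleChangePlanner {
    static Deque<RaftConfigSketch> plan(RaftConfigSketch stable, RaftConfigSketch target) {
        // Nodes that change role between the stable and the target configuration.
        Set<String> switching = new HashSet<>();
        for (String node : stable.learners) {
            if (target.peers.contains(node)) {
                switching.add(node); // learner -> peer
            }
        }
        for (String node : stable.peers) {
            if (target.learners.contains(node)) {
                switching.add(node); // peer -> learner
            }
        }

        Deque<RaftConfigSketch> queue = new ArrayDeque<>();
        if (!switching.isEmpty()) {
            // First switch: drop the switching nodes from their old role.
            // Assumption: all other changes wait for the final step.
            Set<String> peers = new HashSet<>(stable.peers);
            Set<String> learners = new HashSet<>(stable.learners);
            peers.removeAll(switching);
            learners.removeAll(switching);
            queue.add(new RaftConfigSketch(peers, learners));
        }
        // Last switch: the target configuration itself.
        queue.add(target);
        return queue;
    }
}
```

With stable peers {B, C} and learner {A}, and target peers {A, B, C}, the plan contains an intermediate configuration with peers {B, C} and no learners, followed by the target. The sketch does not guard against an intermediate configuration that would be left without peers.
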
> The values under the meta storage pending assignments prefix
> "assignments.pending." should be turned into a queue of pending
> assignments. The queue is created for a replication group by the rebalance
> trigger, or during the switch of planned assignments to pending when it is
> detected that a direct transition from stable assignments to pending is not
> possible. Each element of the queue contains some intermediate state of the
> Raft configuration, and only the last element of the queue is the target
> assignments.
> It is important that the whole queue is logically a single rebalance,
> scheduled by a single trigger. It can be modified only in the process of
> rebalancing. The meaning of stable and planned assignments does not change,
> and the stable assignments’ switch happens only after the whole pending
> assignments queue has been processed. So, no replicas should be stopped
> until that moment (only Raft configurations may change), because replicas
> are stopped and storages are deleted only by the stable assignments’ change
> listener.
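
The queue semantics described above can be sketched roughly as follows. This is an in-memory illustration only: the real values live in the meta storage, `PendingQueueModel` and its methods are hypothetical names, and assignments are reduced to sets of node names with roles ignored.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory model of stable assignments plus the pending queue. */
final class PendingQueueModel {
    private Set<String> stable;
    private final Deque<Set<String>> pending = new ArrayDeque<>();

    PendingQueueModel(Set<String> stable) {
        this.stable = stable;
    }

    /** One rebalance trigger schedules the whole queue at once. */
    void schedule(List<Set<String>> steps) {
        if (steps.isEmpty()) {
            throw new IllegalArgumentException("at least the target assignments are required");
        }
        if (!pending.isEmpty()) {
            // Simplification: planned assignments are not modelled in this sketch,
            // so a second rebalance is simply rejected.
            throw new IllegalStateException("rebalance already in progress");
        }
        pending.addAll(steps);
    }

    /** The element to be applied next. */
    Set<String> head() {
        return pending.peekFirst();
    }

    /** Only the last element is the target assignments. */
    Set<String> target() {
        return pending.peekLast();
    }

    /**
     * Called once Raft has switched to the head configuration.
     * Returns true if the queue is drained and stable assignments were switched.
     */
    boolean onHeadApplied() {
        Set<String> applied = pending.pollFirst();
        if (applied == null) {
            throw new IllegalStateException("no pending assignments");
        }
        if (pending.isEmpty()) {
            stable = applied; // the stable switch happens only here
            return true;
        }
        return false; // intermediate step: stable assignments stay untouched
    }

    Set<String> stable() {
        return stable;
    }
}
```

In this model, a listener that stops replicas would react only when `onHeadApplied()` returns true, which corresponds to the point where the whole queue has been processed.
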
> *Definition of done*
> Pending assignments are turned into a queue without any change in the
> logic. This is a prerequisite for further changes.
> The pending assignments’ change handler should process the first element of
> the pending assignments queue (PAQ), performing
> changePeersAndLearnersAsync() with the assignments from it.
> Listeners of leader reelection and primary replica change should also be
> adjusted.
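
As a rough illustration of the handler behaviour, the sketch below passes only the first queue element to Raft. `RaftClientStub` is a hypothetical stand-in whose method merely borrows the name changePeersAndLearnersAsync(); its parameters are assumed and do not reflect the real Ignite signature. `QueuedAssignments` and `PendingChangeHandler` are likewise illustrative names.

```java
import java.util.Deque;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

/** Hypothetical stand-in for the Raft client; not the real Ignite interface. */
interface RaftClientStub {
    CompletableFuture<Void> changePeersAndLearnersAsync(Set<String> peers, Set<String> learners);
}

/** Hypothetical element of the pending assignments queue (PAQ). */
final class QueuedAssignments {
    final Set<String> peers;
    final Set<String> learners;

    QueuedAssignments(Set<String> peers, Set<String> learners) {
        this.peers = peers;
        this.learners = learners;
    }
}

/** Hypothetical pending assignments change handler. */
final class PendingChangeHandler {
    private final RaftClientStub raft;

    PendingChangeHandler(RaftClientStub raft) {
        this.raft = raft;
    }

    /** Reacts to a new PAQ value: only the first element is sent to Raft. */
    CompletableFuture<Void> onPendingChanged(Deque<QueuedAssignments> paq) {
        QueuedAssignments head = paq.peekFirst();
        if (head == null) {
            // Nothing is pending, so there is no configuration to apply.
            return CompletableFuture.completedFuture(null);
        }
        // The queue itself is not modified here; later elements wait their turn.
        return raft.changePeersAndLearnersAsync(head.peers, head.learners);
    }
}
```

Leader reelection and primary replica change listeners are left out of the sketch, since the ticket only states that they need to be adjusted.
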
> *Implementation notes*
> There are two different kinds of pending assignments, for tables and for
> zones (until data colocation is implemented and the responsibility for
> partitions is fully transferred to zones):
> RebalanceUtil#PENDING_ASSIGNMENTS_PREFIX and
> ZoneRebalanceUtil#PENDING_ASSIGNMENTS_PREFIX. This ticket covers both.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
