[ https://issues.apache.org/jira/browse/IGNITE-24069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mikhail Pochatkin reassigned IGNITE-24069:
------------------------------------------

    Assignee: Vadim Kolodin

> Turn the pending assignments into a queue
> -----------------------------------------
>
>          Key: IGNITE-24069
>          URL: https://issues.apache.org/jira/browse/IGNITE-24069
>      Project: Ignite
>   Issue Type: Improvement
>     Reporter: Denis Chudov
>     Assignee: Vadim Kolodin
>     Priority: Major
>       Labels: ignite-3
>
> *Motivation*
> In Raft, a configuration switch requires joint consensus, where the nodes
> from the old and new configurations are included with their corresponding
> roles. So we cannot simply include a node as a follower in the new
> configuration if it was a learner in the previous one. The rule of joint
> consensus requires that this node first be removed as a learner and only
> then included in the next configuration as a peer, so there will be two
> configuration switches. Downgrading works the same way.
> The handlers of the pending and stable assignments’ switch should be aware of
> the changes when some node (let’s say, node A) is turned from a learner into
> a peer or vice versa, from a peer into a learner. There should be two
> consecutive configuration switches for either upgrade or downgrade: in the
> first one, node A is removed from its old role (for an upgrade, as the
> learner), and in the second one, it is added in its new role (for an upgrade,
> as a peer).
> The values for the meta storage pending assignments prefix
> "assignments.pending." should be turned into a queue of pending assignments.
> The queue is created for a replication group by the rebalance trigger or
> during the switch of planned assignments to pending, when it is detected that
> a direct transition from stable assignments to pending is not possible. It
> stores a sequence of assignments, each of which contains some intermediate
> state of the Raft configuration; only the last assignments in the queue are
> the target assignments.
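
The queue construction described above can be sketched roughly as follows. This is only an illustration, not the Ignite implementation: the `Assignments` record, `roleSwitchingNodes` and `buildQueue` are hypothetical names, and the choice of intermediate state (the stable configuration with the role-switching nodes removed) is an assumption. The sketch also ignores edge cases such as an intermediate configuration with no peers left.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

public class PendingAssignmentsQueueSketch {
    /** Hypothetical model of one Raft configuration: node names of peers and learners. */
    record Assignments(Set<String> peers, Set<String> learners) {}

    /** Nodes that change role between the two configurations (learner to peer, or peer to learner). */
    static Set<String> roleSwitchingNodes(Assignments stable, Assignments target) {
        Set<String> promoted = new HashSet<>(stable.learners());
        promoted.retainAll(target.peers());

        Set<String> demoted = new HashSet<>(stable.peers());
        demoted.retainAll(target.learners());

        promoted.addAll(demoted);
        return promoted;
    }

    /**
     * Builds the queue of pending assignments. If no node switches roles, the queue holds
     * only the target. Otherwise an intermediate configuration without the switching nodes
     * goes first (assumed shape), and the target is always the last element.
     */
    static Deque<Assignments> buildQueue(Assignments stable, Assignments target) {
        Deque<Assignments> queue = new ArrayDeque<>();
        Set<String> switching = roleSwitchingNodes(stable, target);

        if (!switching.isEmpty()) {
            Set<String> peers = new HashSet<>(stable.peers());
            peers.removeAll(switching);

            Set<String> learners = new HashSet<>(stable.learners());
            learners.removeAll(switching);

            queue.addLast(new Assignments(peers, learners));
        }

        queue.addLast(target);
        return queue;
    }

    public static void main(String[] args) {
        // Node A is a learner in the stable configuration and a peer in the target one.
        Assignments stable = new Assignments(Set.of("B", "C"), Set.of("A"));
        Assignments target = new Assignments(Set.of("A", "B", "C"), Set.of());

        // Two elements: first A is removed as a learner, then A is added as a peer.
        buildQueue(stable, target).forEach(System.out::println);
    }
}
```

With no role-switching node the queue degenerates to a single element, which matches the current behaviour of a single pending assignments value.
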
> It is important that the whole queue is logically a single rebalance,
> scheduled by a single trigger. It can be modified only in the process of
> rebalancing. The meaning of stable and planned assignments is not changed,
> and the stable assignments’ switch happens only after the whole pending
> assignments queue has been processed. So no replicas should be stopped until
> that moment (only Raft configurations may be changed), because replicas are
> stopped and storages are deleted only by the stable assignments’ change
> listener.
>
> *Definition of done*
> Pending assignments are turned into a queue without any change in the logic.
> This is the prerequisite for further changes.
> The pending assignments’ change handler should process the first element of
> the pending assignments queue (PAQ), performing changePeersAndLearnersAsync()
> using the assignments from it.
> Listeners of leader re-election and primary replica change should also be
> adjusted.
>
> *Implementation notes*
> There are 2 different pending assignments: for tables and for zones (until
> data colocation is implemented and the responsibility for partitions is fully
> transferred to zones): RebalanceUtil#PENDING_ASSIGNMENTS_PREFIX and
> ZoneRebalanceUtil#PENDING_ASSIGNMENTS_PREFIX. This ticket covers both.
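
The handler behaviour from the definition of done might look roughly like the sketch below. It is a simplified in-memory model under stated assumptions: the `RaftGroup` interface and the `Assignments` record are hypothetical, only the method name `changePeersAndLearnersAsync` comes from the ticket (its real signature is not shown there), and the real queue lives in the meta storage and is driven by listeners rather than by a recursive call.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

public class PendingQueueHandlerSketch {
    /** Hypothetical model of one Raft configuration. */
    record Assignments(Set<String> peers, Set<String> learners) {}

    /** Hypothetical stand-in for the Raft client; the parameter type is an assumption. */
    interface RaftGroup {
        CompletableFuture<Void> changePeersAndLearnersAsync(Assignments assignments);
    }

    private final Deque<Assignments> pendingQueue;
    private final RaftGroup raft;
    private Assignments stable;

    PendingQueueHandlerSketch(Assignments stable, Deque<Assignments> pendingQueue, RaftGroup raft) {
        this.stable = stable;
        this.pendingQueue = pendingQueue;
        this.raft = raft;
    }

    Assignments stable() {
        return stable;
    }

    /**
     * Applies the first element of the queue. When the switch completes, the element is
     * removed; stable assignments change only after the last element has been applied.
     */
    CompletableFuture<Void> processPending() {
        Assignments head = pendingQueue.peekFirst();

        if (head == null) {
            return CompletableFuture.completedFuture(null);
        }

        return raft.changePeersAndLearnersAsync(head).thenCompose(unused -> {
            pendingQueue.pollFirst();

            if (pendingQueue.isEmpty()) {
                // The last element is the target: only now do stable assignments switch.
                stable = head;
                return CompletableFuture.completedFuture(null);
            }

            // Intermediate state applied; stable stays untouched, continue with the next one.
            return processPending();
        });
    }

    public static void main(String[] args) {
        Assignments stable = new Assignments(Set.of("B"), Set.of("A"));
        Assignments intermediate = new Assignments(Set.of("B"), Set.of());
        Assignments target = new Assignments(Set.of("A", "B"), Set.of());

        // Fake Raft group that logs each requested configuration and completes immediately.
        RaftGroup raft = assignments -> {
            System.out.println("switch to " + assignments);
            return CompletableFuture.completedFuture(null);
        };

        var handler = new PendingQueueHandlerSketch(
                stable, new ArrayDeque<>(List.of(intermediate, target)), raft);

        handler.processPending().join();
        System.out.println("stable is now " + handler.stable());
    }
}
```

Keeping the stable assignments untouched until the queue is empty mirrors the requirement that no replica is stopped while intermediate configurations are being applied.
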