Mirza Aliev created IGNITE-24538:
------------------------------------

             Summary: Describe algorithm in the HA mode which solves force 
reset rewriting scheduled rebalance problem 
                 Key: IGNITE-24538
                 URL: https://issues.apache.org/jira/browse/IGNITE-24538
             Project: Ignite
          Issue Type: Task
            Reporter: Mirza Aliev


h3. Motivation

While extending test coverage 
(https://issues.apache.org/jira/browse/IGNITE-24410), we found a scenario in 
which the majority is lost, the zone filter is changed, and the automatic reset 
then tries to restore the majority. It turned out that the force reset 
overwrites the intent to rebalance the partition that was triggered by the 
filter change and, moreover, this intent is never triggered again. In that case 
the user needs to change the filter again, which looks a bit odd.

Detailed scenario: 

*Precondition*
 - Create an HA zone with a filter that allows nodes A, B and C.
 - Make sure {{partitionDistributionResetTimeout}} is high enough not to 
trigger before the following actions happen.
 - Stop nodes B and C.
 - Change the zone filter to allow nodes D, E and F. These new nodes should be 
up and running.
 - Change {{partitionDistributionResetTimeout}} to a smaller value or 0 to 
trigger the automatic reset.

*Result*

The partition remains on node A

*Expected result*

The partition is moved to D, E and F as per the filter



The same problem could happen with scale up/scale down or other rebalance 
triggers that are about to be applied: if the majority is suddenly lost, such 
intents are lost as well.

In this task we need to come up with an algorithm that resolves such issues.

The expected behaviour is that a filter change, or any other event that changes 
assignments, must still be applied somehow when the majority is lost 
concurrently after such an event.

h3. Implementation notes

Some ideas that could help:

When we do a 2-phase reset and write the single node with the most up-to-date 
raft log to pending in a force manner, we also write planned with the rest of 
the alive nodes from stable. Instead of writing such a planned, we could write 
planned in accordance with the pending that existed just before we try to write 
the new force pending.

In our example, we could write D,E,F to planned instead of an empty set.
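
To make the idea concrete, here is a minimal, self-contained sketch of how the 
planned set could be chosen. It is only a model of the description above, not 
Ignite code: the class {{ForceResetSketch}}, the type {{ResetPlan}} and the 
methods {{currentReset}} and {{proposedReset}} are hypothetical names, and 
assignments are simplified to sets of node names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical model of the 2-phase reset; none of these names exist in Ignite. */
public class ForceResetSketch {
    /** Outcome of a reset: the forced pending and the planned that follows it. */
    public static final class ResetPlan {
        public final Set<String> forcePending;
        public final Set<String> planned;

        ResetPlan(Set<String> forcePending, Set<String> planned) {
            this.forcePending = forcePending;
            this.planned = planned;
        }
    }

    /** Behaviour described in the ticket: planned = the rest of the alive nodes from stable. */
    public static ResetPlan currentReset(Set<String> stable, Set<String> alive, String mostUpToDate) {
        Set<String> planned = new LinkedHashSet<>(stable);
        planned.retainAll(alive);     // keep only stable nodes that are still alive
        planned.remove(mostUpToDate); // this node already goes to the forced pending
        return new ResetPlan(Collections.singleton(mostUpToDate), planned);
    }

    /** Proposed idea: if a pending existed before the force write, carry it over as planned. */
    public static ResetPlan proposedReset(
            Set<String> stable, Set<String> alive, String mostUpToDate, Set<String> previousPending) {
        if (previousPending == null || previousPending.isEmpty()) {
            // No rebalance intent to preserve, fall back to the behaviour above.
            return currentReset(stable, alive, mostUpToDate);
        }
        return new ResetPlan(Collections.singleton(mostUpToDate), new LinkedHashSet<>(previousPending));
    }

    private static Set<String> nodes(String... names) {
        return new LinkedHashSet<>(Arrays.asList(names));
    }

    public static void main(String[] args) {
        // Scenario from the ticket: stable is A,B,C; B and C are stopped;
        // the filter change has already scheduled a rebalance to D,E,F.
        Set<String> stable = nodes("A", "B", "C");
        Set<String> alive = nodes("A", "D", "E", "F");
        Set<String> pendingFromFilterChange = nodes("D", "E", "F");

        ResetPlan current = currentReset(stable, alive, "A");
        ResetPlan proposed = proposedReset(stable, alive, "A", pendingFromFilterChange);

        System.out.println("current planned:  " + current.planned);
        System.out.println("proposed planned: " + proposed.planned);
    }
}
```

In this model the current approach leaves planned empty for the scenario above, 
so the partition stays on A, while the proposed variant keeps D,E,F as the next 
rebalance target. Whether the carried-over pending must also be checked against 
the currently alive nodes is left open here and would be part of the algorithm 
to describe.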

h3. Definition of Done

* We have described an algorithm that solves the problem of pending rebalances 
being lost when we write force pending on majority loss.




--
This message was sent by Atlassian Jira
(v8.20.10#820010)