[ 
https://issues.apache.org/jira/browse/IGNITE-23735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-23735:
---------------------------------
    Description: 
h3. Motivation

In https://issues.apache.org/jira/browse/IGNITE-22904, logic was implemented 
that prevents leader hijacking. More details can be found in that ticket's 
description; briefly, when a node comes back after a majority reset, it might 
still think it is a member of the voting set (judging by its local partition 
Raft log), so it might propose itself as a candidate, and it can win the 
election if there are enough such nodes. This results in leadership being 
hijacked by the 'old' majority, which messes up the repaired partition majority.

Let's consider an example:

# Replication factor is set to 3, assignments = peers = ABC; the index with 
configuration ABC is 10.
# ABC nodes fail.
# The group is repaired, assignments = peers = CDE, with C as the leader.
# AB nodes recover.
# The replication factor changes to 5, assignments = ABCDE; the index with 
configuration ABCDE is 20.
# C replicates the log to AB up to index 10 and then fails.
# AB assume the configuration is ABC and start an election, electing A as 
leader before receiving vote requests from D or E.

As a result, A is elected leader, even though it is not the most up-to-date 
node in terms of the log (since CDE has a more advanced log). This violates 
Raft's invariants.
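To make the violated invariant concrete, here is a minimal sketch of Raft's 
"at least as up-to-date" vote check (illustrative names only, not Ignite's 
actual implementation). The check only protects the group if the candidate 
actually asks the up-to-date voters; since AB believe the configuration is 
still ABC, A collects a majority of that stale configuration without ever 
contacting D or E:

```java
// Minimal sketch of Raft's RequestVote log comparison.
// Names are illustrative; this is not Ignite's actual code.
public class VoteCheck {
    /** A voter grants its vote only if the candidate's log is at least as up-to-date. */
    static boolean grantsVote(long candLastTerm, long candLastIndex,
                              long voterLastTerm, long voterLastIndex) {
        if (candLastTerm != voterLastTerm) {
            return candLastTerm > voterLastTerm;
        }
        return candLastIndex >= voterLastIndex;
    }

    public static void main(String[] args) {
        // A campaigns with a log up to index 10; B, also at index 10, grants
        // the vote, so {A, B} is already a majority of the stale set "ABC".
        System.out.println(grantsVote(1, 10, 1, 10)); // true

        // D, at index 20 in the repaired group, would refuse the vote -- but A
        // never asks D, because A does not know configuration ABCDE exists.
        System.out.println(grantsVote(1, 10, 1, 20)); // false
    }
}
```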

In this ticket we must reuse the logic that sets a fake configuration when a 
node receives data from a new Raft group (see 
{{NodeImpl#refreshLeadershipAbstaining}}), together with the logic of the 
externally enforced config index (see 
{{RaftGroupOptions#externallyEnforcedConfigIndex()}}), so that nodes that 
restart and receive the Raft log won't try to elect a leader until all data is 
replicated.

h3. Definition of done 

* Nodes that join after partition majority reset must not elect a leader from 
the old majority that could hijack leadership and cause havoc in the repaired 
group. 



> resetPartitions improvements: leader hijack protection must be implemented
> --------------------------------------------------------------------------
>
>                 Key: IGNITE-23735
>                 URL: https://issues.apache.org/jira/browse/IGNITE-23735
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Mirza Aliev
>            Priority: Major
>              Labels: ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
