[ 
https://issues.apache.org/jira/browse/KAFKA-16530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849121#comment-17849121
 ] 

Alyssa Huang commented on KAFKA-16530:
--------------------------------------

Consider the case where the leader has been removed from the voter set and then 
tries to update its log end offset (`updateLocalState`), for instance because 
of a new removeNode record. It first updates its own ReplicaState 
(`getOrCreateReplicaState`), which returns a _new_ Observer state if its id is 
no longer in the `voterStates` map. The endOffset is updated, and then we 
consider whether the high watermark can be updated (`maybeUpdateHighWatermark`). 
When updating the high watermark we only look at the `voterStates` map, so the 
leader's offset is not counted in the HW calculation. This _does_ mean the HW 
can drop, though. Here's a scenario:


{code:java}
# Before node 1 removal, voterStates contains Nodes 1, 2, 3
Node 1: Leader, LEO 100
Node 2: Follower, LEO 90 <- HW
Node 3: Follower, LEO 85

# Leader processes removeNode record, voterStates contains Nodes 2, 3
Node 1: Leader, LEO 101
Node 2: Follower, LEO 90
Node 3: Follower, LEO 85 <- new HW{code}
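
To make the arithmetic in this scenario concrete, here is a minimal sketch. It 
assumes the HW is taken at index `size / 2` of the voters' log end offsets 
sorted in descending order, as in the `maybeUpdateHighWatermark` snippet in 
this comment. The class `VoterOnlyHighWatermark` and its method are 
hypothetical names for illustration, not the actual `LeaderState` code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model: only voter log end offsets are tracked.
final class VoterOnlyHighWatermark {
    // Sort the voters' end offsets in descending order and pick index size / 2.
    // Assumes the voter map is non-empty.
    static long highWatermark(Map<Integer, Long> voterEndOffsets) {
        List<Long> offsets = new ArrayList<>(voterEndOffsets.values());
        offsets.sort(Comparator.reverseOrder());
        return offsets.get(offsets.size() / 2);
    }

    public static void main(String[] args) {
        // Before removal: voters are nodes 1, 2, 3.
        long before = highWatermark(Map.of(1, 100L, 2, 90L, 3, 85L));
        // After removal: the leader (node 1) is no longer a voter.
        long after = highWatermark(Map.of(2, 90L, 3, 85L));
        System.out.println("HW before removal: " + before); // 90
        System.out.println("HW after removal:  " + after);  // 85
    }
}
```

With only nodes 2 and 3 left in the map, index `2 / 2 = 1` selects the lowest 
offset, which is why the HW moves from 90 to 85.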

We want to make sure the HW does not decrease in this scenario. Perhaps we 
could revise `maybeUpdateHighWatermark` to keep factoring the leader's offset 
into the HW calculation, regardless of whether it is in the voter set, e.g.:
{code:java}
  private boolean maybeUpdateHighWatermark() {
    // Find the largest offset which is replicated to a majority of replicas (the leader counts)
-   List<ReplicaState> followersByDescendingFetchOffset = followersByDescendingFetchOffset();
+   List<ReplicaState> followersAndLeaderByDescFetchOffset = followersAndLeadersByDescFetchOffset();

-   int indexOfHw = voterStates.size() / 2;
+   int indexOfHw = followersAndLeaderByDescFetchOffset.size() / 2;
-   Optional<LogOffsetMetadata> highWatermarkUpdateOpt = followersByDescendingFetchOffset.get(indexOfHw).endOffset;
+   Optional<LogOffsetMetadata> highWatermarkUpdateOpt = followersAndLeaderByDescFetchOffset.get(indexOfHw).endOffset;{code}

However, this does not cover the case where a follower is removed from the 
voter set. The leader is still counted there, yet the HW can still drop:

{code:java}
# Before node 2 removal, voterStates contains Nodes 1, 2, 3
Node 1: Leader, LEO 100
Node 2: Follower, LEO 90 <- HW
Node 3: Follower, LEO 85

# Leader processes removeNode record, voterStates contains Nodes 1, 3
Node 1: Leader, LEO 101
Node 2: Follower, LEO 90
Node 3: Follower, LEO 85 <- new HW{code}
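
The two removal cases can be compared with a small sketch of the 
leader-inclusive calculation. It assumes the same `size / 2` index rule over 
descending offsets; `LeaderInclusiveHighWatermark` and its parameters are 
hypothetical names for illustration, not the real KRaft code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified model of the leader-inclusive calculation.
final class LeaderInclusiveHighWatermark {
    // Always count the leader's log end offset, plus the offsets of the
    // followers still in the voter set, then pick index size / 2 of the
    // descending order.
    static long highWatermark(long leaderEndOffset, List<Long> followerVoterEndOffsets) {
        List<Long> offsets = new ArrayList<>(followerVoterEndOffsets);
        offsets.add(leaderEndOffset);
        offsets.sort(Comparator.reverseOrder());
        return offsets.get(offsets.size() / 2);
    }

    public static void main(String[] args) {
        // Leader (node 1) removed: nodes 2 and 3 remain voters.
        System.out.println(highWatermark(101L, List.of(90L, 85L))); // 90
        // Follower (node 2) removed: node 3 is the only follower voter left.
        System.out.println(highWatermark(101L, List.of(85L)));      // 85
    }
}
```

Counting the leader keeps the HW at 90 when the leader is removed, but when 
node 2 is removed the list shrinks to two offsets and index 1 again selects 85, 
below the previous HW of 90.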

> Fix high-watermark calculation to not assume the leader is in the voter set
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-16530
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16530
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: kraft
>            Reporter: José Armando García Sancio
>            Assignee: Alyssa Huang
>            Priority: Major
>             Fix For: 3.8.0
>
>
> When the leader is being removed from the voter set, the leader may not be in 
> the voter set. This means that kraft should not assume that the leader is 
> part of the high-watermark calculation.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
