[ https://issues.apache.org/jira/browse/KAFKA-13141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rajini Sivaram resolved KAFKA-13141.
------------------------------------
Reviewer: Jason Gustafson
Resolution: Fixed
> Leader should not update follower fetch offset if diverging epoch is present
> ----------------------------------------------------------------------------
>
> Key: KAFKA-13141
> URL: https://issues.apache.org/jira/browse/KAFKA-13141
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 2.8.0, 2.7.1
> Reporter: Jason Gustafson
> Assignee: Rajini Sivaram
> Priority: Blocker
> Fix For: 3.0.0, 2.7.2, 2.8.1
>
>
> In 2.7, we began doing fetcher truncation piggybacked on the Fetch protocol
> instead of using the old OffsetsForLeaderEpoch API. When truncation is
> detected, we return a `divergingEpoch` field in the Fetch response, but we do
> not set an error code. The sender is expected to check if the diverging epoch
> is present and truncate accordingly.
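
To illustrate the follower-side check described above, here is a minimal sketch. The class and method names (`DivergingEpoch`, `FollowerLog`, `handleResponse`) are hypothetical and are not Kafka's real classes. It also assumes a simplified truncation rule: the follower truncates to the end offset reported with the diverging epoch, and never extends its log while doing so.

```java
import java.util.Optional;

// Hypothetical, simplified types for illustration; not Kafka's real classes.
final class DivergingEpoch {
    final int epoch;
    final long endOffset;

    DivergingEpoch(int epoch, long endOffset) {
        this.epoch = epoch;
        this.endOffset = endOffset;
    }
}

final class FollowerLog {
    private long logEndOffset;

    FollowerLog(long logEndOffset) {
        this.logEndOffset = logEndOffset;
    }

    long logEndOffset() {
        return logEndOffset;
    }

    /**
     * Handles one fetch response and returns the next offset to fetch from.
     * Divergence is not signalled through an error code, so the optional
     * field has to be inspected explicitly before any records are appended.
     */
    long handleResponse(Optional<DivergingEpoch> diverging, int recordCount) {
        if (diverging.isPresent()) {
            // Assumed simplification: truncate to the leader-reported end
            // offset, without ever growing the local log.
            logEndOffset = Math.min(logEndOffset, diverging.get().endOffset);
            return logEndOffset;
        }
        // No divergence: append the fetched records.
        logEndOffset += recordCount;
        return logEndOffset;
    }
}
```

With a follower at offset 101 and a diverging epoch whose end offset is 100, the sketch truncates to 100 and resumes fetching from there.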
> All of this works correctly in the fetcher implementation, but the problem is
> that the logic to update the follower fetch position on the leader does not
> take into account the diverging epoch present in the response. This means the
> fetch offsets can be updated incorrectly, which can lead to either log
> divergence or the loss of committed data.
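
The leader-side guard that the issue title asks for could look roughly like the sketch below. `FollowerFetchTracker` and `maybeUpdate` are hypothetical names chosen for illustration; this is not the actual patch, only the shape of the check: the follower's fetch offset is recorded only when the response carries no diverging epoch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical illustration only; these names are not Kafka's real classes.
final class FollowerFetchTracker {
    // Last fetch offset the leader has accepted for each follower.
    private final Map<Integer, Long> fetchOffsets = new HashMap<>();

    /**
     * Records the follower's fetch offset unless the response being returned
     * carries a diverging epoch. Returns true if the offset was recorded.
     */
    boolean maybeUpdate(int followerId, long fetchOffset, OptionalInt divergingEpoch) {
        if (divergingEpoch.isPresent()) {
            // The follower has to truncate first, so the offset it sent
            // does not describe data that matches the leader's log.
            return false;
        }
        fetchOffsets.put(followerId, fetchOffset);
        return true;
    }

    // Returns the recorded offset, or -1 if none has been accepted yet.
    long fetchOffset(int followerId) {
        return fetchOffsets.getOrDefault(followerId, -1L);
    }
}
```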
> For example, we hit the following case with 3 replicas. Leader 1 is elected
> in epoch 1 with an end offset of 100. The followers are at offset 101:
> Broker 1: (Leader) Epoch 1 from offset 100
> Broker 2: (Follower) Epoch 1 from offset 101
> Broker 3: (Follower) Epoch 1 from offset 101
> Broker 1 receives fetches from 2 and 3 at offset 101. The leader detects the
> divergence and returns a diverging epoch in the Fetch response. Nevertheless,
> the fetch positions for both followers are updated to 101 and the high
> watermark is advanced.
> After brokers 2 and 3 had truncated to offset 100, broker 1 experienced a
> network partition of some kind and was kicked from the ISR. This caused
> broker 2 to get elected, which resulted in the following state at the start
> of epoch 2.
> Broker 1: (Follower) Epoch 2 from offset 101
> Broker 2: (Leader) Epoch 2 from offset 100
> Broker 3: (Follower) Epoch 2 from offset 100
> Broker 2 was then able to write a new entry at offset 100, and the old record,
> which may have been exposed to consumers, was deleted by broker 1.
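
The scenario above can be replayed with a small model. Everything here is an assumption made for illustration: `LeaderModel` is a hypothetical class, a fetch counts as diverging when its offset is past the leader's log end, follower positions start at 0 until a fetch is accepted, and the high watermark is taken as the minimum of the leader's end offset and the recorded follower offsets. The model also assumes that the leader appends one record at offset 100 after the two fetches arrive.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the scenario; not Kafka's real replica handling.
final class LeaderModel {
    private final boolean skipUpdateOnDivergence;
    private long logEndOffset;
    private final Map<Integer, Long> followerOffsets = new HashMap<>();

    LeaderModel(long logEndOffset, boolean skipUpdateOnDivergence) {
        this.logEndOffset = logEndOffset;
        this.skipUpdateOnDivergence = skipUpdateOnDivergence;
    }

    // Follower position is modeled as 0 until a fetch is accepted.
    void registerFollower(int followerId) {
        followerOffsets.put(followerId, 0L);
    }

    // Appends a single record to the leader's log.
    void append() {
        logEndOffset++;
    }

    // In this model a fetch diverges when it asks for an offset
    // beyond the leader's log end.
    void onFetch(int followerId, long fetchOffset) {
        boolean diverging = fetchOffset > logEndOffset;
        if (diverging && skipUpdateOnDivergence) {
            return; // guarded behavior: leave the follower position alone
        }
        followerOffsets.put(followerId, fetchOffset);
    }

    // Minimum of the leader's end offset and every follower's recorded offset.
    long highWatermark() {
        long hw = logEndOffset;
        for (long offset : followerOffsets.values()) {
            hw = Math.min(hw, offset);
        }
        return hw;
    }
}
```

Without the guard, two fetches at offset 101 followed by one append move the modeled high watermark to 101, even though both followers are about to truncate to 100 and hold no record at offset 100. With the guard, the follower positions are not touched and the high watermark does not advance.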