[ https://issues.apache.org/jira/browse/KAFKA-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15022117#comment-15022117 ]

jin xing commented on KAFKA-2334:
---------------------------------

After receiving the LeaderAndIsrRequest, Broker B2 eventually calls 
"Partition::makeLeader"; part of the code is shown below:
...
      zkVersion = leaderAndIsr.zkVersion
      leaderReplicaIdOpt = Some(localBrokerId)
      // construct the high watermark metadata for the new leader replica
      val newLeaderReplica = getReplica().get
      newLeaderReplica.convertHWToLocalOffsetMetadata()
      // reset log end offset for remote replicas
      assignedReplicas.foreach(r =>
        if (r.brokerId != localBrokerId)
          r.logEndOffset = LogOffsetMetadata.UnknownOffsetMetadata)
      // we may need to increment high watermark since ISR could be down to 1
      maybeIncrementLeaderHW(newLeaderReplica)
      if (topic == OffsetManager.OffsetsTopicName)
        offsetManager.loadOffsetsFromLog(partitionId)
...
I can tell that Broker B2 first sets 'leaderReplicaIdOpt = Some(localBrokerId)' 
and only then tries to update the high watermark.
Once leaderReplicaIdOpt is set, Broker B2 becomes visible to consumers (if a 
consumer sends a FetchRequest, there will be no NotLeaderForPartitionException).
So in the short interval after 'leaderReplicaIdOpt = Some(localBrokerId)' and 
before the high watermark is set up, what the consumer gets is the "gone back" HW.
If my understanding is right, simply reversing the order of setting 
leaderReplicaIdOpt and updating the high watermark would fix this issue.
Am I wrong?
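The race window described above can be sketched as a tiny simulation. This is a hypothetical illustration in Java (the surrounding snippet is Scala, but the shape is the same): PartitionState, handleFetch, and the -1 sentinel are made-up stand-ins, not Kafka's actual classes.

```java
import java.util.Optional;

public class MakeLeaderOrdering {
    static class PartitionState {
        Optional<Integer> leaderReplicaIdOpt = Optional.empty();
        long highWatermark = 0L; // follower's stale HW right after failover

        // HW a consumer observes; -1 stands in for NotLeaderForPartitionException
        long handleFetch() {
            return leaderReplicaIdOpt.isPresent() ? highWatermark : -1L;
        }
    }

    public static void main(String[] args) {
        PartitionState p = new PartitionState();

        // Order in makeLeader today: become leader first, restore HW second.
        p.leaderReplicaIdOpt = Optional.of(2); // B2 now answers fetches...
        long observed = p.handleFetch();       // ...exposing the stale HW
        p.highWatermark = 100L;                // maybeIncrementLeaderHW runs only now

        System.out.println("HW observed in the window: " + observed); // 0, not 100
    }
}
```

Swapping the two assignments closes the window: a fetch arriving before the HW is restored would still get NotLeaderForPartition instead of the stale value.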

> Prevent HW from going back during leader failover 
> --------------------------------------------------
>
>                 Key: KAFKA-2334
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2334
>             Project: Kafka
>          Issue Type: Bug
>          Components: replication
>    Affects Versions: 0.8.2.1
>            Reporter: Guozhang Wang
>            Assignee: Neha Narkhede
>             Fix For: 0.10.0.0
>
>
> Consider the following scenario:
> 0. Kafka uses a replication factor of 2, with broker B1 as the leader and B2 
> as the follower.
> 1. A producer keeps sending to Kafka with ack=-1.
> 2. A consumer repeatedly issues ListOffset requests to Kafka.
> And the following sequence:
> 0. B1's current log-end-offset (LEO) is 0 and its HW offset is 0; same for B2.
> 1. B1 receives a ProduceRequest of 100 messages, appends them to its local 
> log (LEO becomes 100), and holds the request in purgatory.
> 2. B1 receives a FetchRequest starting at offset 0 from follower B2 and 
> returns the 100 messages.
> 3. B2 appends the received messages to its local log (LEO becomes 100).
> 4. B1 receives another FetchRequest starting at offset 100 from B2, learns 
> that B2's LEO has caught up to 100, hence updates its own HW, satisfies the 
> ProduceRequest in purgatory, and sends the FetchResponse with HW 100 back to 
> B2 ASYNCHRONOUSLY.
> 5. B1 successfully sends the ProduceResponse to the producer and then fails, 
> so the FetchResponse does not reach B2, whose HW remains 0.
> From the consumer's point of view, it could first see the latest offset as 
> 100 (from B1), then see the latest offset as 0 (from B2), and then watch the 
> latest offset gradually catch back up to 100.
> This is because the HW is used to guard both ListOffset and 
> fetch-from-ordinary-consumer.
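The sequence above can be traced with plain variables. A minimal sketch, assuming only the LEO/HW bookkeeping described in the steps (the `run` helper and the array return are illustrative, not Kafka code):

```java
public class HwRegression {
    // Returns {offset last seen from B1, offset then seen from new leader B2}.
    static long[] run() {
        long b1Leo = 0, b1Hw = 0, b2Leo = 0, b2Hw = 0;  // step 0: everything at 0
        b1Leo = 100;                 // step 1: B1 appends 100 messages
        b2Leo = 100;                 // steps 2-3: B2 fetches and appends them
        b1Hw = Math.min(b1Leo, b2Leo); // step 4: B1 advances its HW to 100
        // step 5: B1 fails before the FetchResponse carrying HW=100 reaches B2,
        // so b2Hw stays 0 and B2 becomes leader with the stale HW.
        return new long[] { b1Hw, b2Hw };
    }

    public static void main(String[] args) {
        long[] hw = run();
        System.out.println("latest offset from B1:            " + hw[0]); // 100
        System.out.println("then from new leader B2 (regressed): " + hw[1]); // 0
    }
}
```

The consumer observes hw[0] = 100 and then hw[1] = 0, which is exactly the "HW going back" the issue title describes.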



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
