[ 
https://issues.apache.org/jira/browse/KAFKA-3643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264820#comment-15264820
 ] 

Arun Mathew commented on KAFKA-3643:
------------------------------------

Further Details of the Issue.

When an event is received by the broker that leads the partition, it is 
appended to the log at
https://github.com/apache/kafka/blob/4f22705c7d0c8e8cab68883e76f554439341e34a/core/src/main/scala/kafka/server/ReplicaManager.scala#L328

This call writes the event to the leader broker's local log replica, after 
first checking that the broker is indeed the leader replica at
https://github.com/apache/kafka/blob/4f22705c7d0c8e8cab68883e76f554439341e34a/core/src/main/scala/kafka/cluster/Partition.scala#L430

The append succeeds, but because the producer is configured with acks = all, 
a DelayedProduce request is created and added to delayedProducePurgatory. It 
tracks whether the in-sync replicas (ISR) have caught up, so that the producer 
can be acked for the received event.

However, in our experiment the leadership of the partition changes in the 
meantime (due to the broker restart), and the DelayedProduce request, which 
periodically checks for completion, fails at 
https://github.com/apache/kafka/blob/4f22705c7d0c8e8cab68883e76f554439341e34a/core/src/main/scala/kafka/cluster/Partition.scala#L305
where tryCompleteDelayedProduce() checks whether the broker is still the leader.

This causes the ProduceRequest to be negatively acknowledged with a 
NOT_LEADER_FOR_PARTITION error, even though the replicas might have correctly 
replicated the event.
Also, nothing is done to roll back the event that was committed to the local 
log while the broker was still the leader for the partition.

The producer then resends the event to the current leader broker, where it is 
accepted. The producer is unaware that the previous attempt was also committed 
and replicated by all replicas in the partition, so the event is stored twice.
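
The sequence described above can be illustrated with a small toy model. This 
is only a sketch in Python, not Kafka code: the Broker and Partition classes, 
send_with_retry() and demo() are hypothetical names made up for illustration, 
and replication is assumed to happen instantly. It only shows how an append, 
followed by a leadership change before the delayed ack check, followed by a 
producer retry, leaves the same event in the log twice.

```python
from dataclasses import dataclass, field

NONE = "NONE"
NOT_LEADER_FOR_PARTITION = "NOT_LEADER_FOR_PARTITION"


@dataclass
class Broker:
    """Hypothetical stand-in for a broker holding one partition replica."""
    name: str
    log: list = field(default_factory=list)


class Partition:
    """Toy single-partition cluster: one leader, all others are followers."""

    def __init__(self, brokers, leader):
        self.brokers = brokers
        self.leader = leader

    def append_on_leader(self, broker, event):
        # The append is accepted only if `broker` is currently the leader.
        if broker is not self.leader:
            return False
        broker.log.append(event)
        return True

    def replicate(self):
        # Assumption: followers instantly copy what they are missing.
        for b in self.brokers:
            if b is not self.leader:
                b.log.extend(self.leader.log[len(b.log):])

    def try_complete_delayed_produce(self, broker):
        # The delayed check re-tests leadership before acking the producer.
        if broker is not self.leader:
            return NOT_LEADER_FOR_PARTITION
        return NONE


def send_with_retry(partition, event, leader_change_to=None):
    """Send one event with acks=all; retry once on a leadership error."""
    first = partition.leader
    partition.append_on_leader(first, event)
    partition.replicate()                    # followers catch up ...
    if leader_change_to is not None:
        partition.leader = leader_change_to  # ... then leadership moves
    error = partition.try_complete_delayed_produce(first)
    if error == NOT_LEADER_FOR_PARTITION:
        # The producer cannot tell that the first attempt was replicated,
        # so it sends the same event again to the current leader.
        partition.append_on_leader(partition.leader, event)
        partition.replicate()
    return error


def demo():
    b1, b2, b3 = Broker("B1"), Broker("B2"), Broker("B3")
    p = Partition([b1, b2, b3], leader=b2)
    send_with_retry(p, "event-0")                       # no leader change
    send_with_retry(p, "event-1", leader_change_to=b1)  # B1 takes over
    return b1.log, b2.log, b3.log
```

In this model, demo() sends two events and every replica ends up with three 
entries, because "event-1" is stored once by the old leader and once more by 
the retry.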


> Data Duplication on clean restart of Kafka Broker
> -------------------------------------------------
>
>                 Key: KAFKA-3643
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3643
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.9.0.1
>            Reporter: Arun Mathew
>
> We observed event duplication when partition leadership is restored from 
> the new leader to the preferred leader upon restart of the preferred leader.
> Steps to Reproduce:
> - Three Broker Kafka Cluster (B1, B2, B3)
> - Create a topic with 3 replicas and 1 partition. 
>       - [B1 is assigned the (preferred) Leader, B2, B3 are ISR]
> - Start sending events using the performance producer, with a number of 
> events large enough to last a few minutes and cover the broker restart 
> interval (say 4 million)
>       - set producer batch size = 1
> - Clean shutdown Leader Broker B1
>       - Event sending continues
>       - Now, B2 is the new Leader and B3 is ISR.
> - Restart the Broker B1 (preferred leader for Partition 0)
>       - The replica in B1 catches up and becomes the Leader for P-0
> - Wait for producer to finish
> - Use the get offset command to get the event count in the partition; it is 
> higher than the number of events sent (4M)
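
The last step above amounts to comparing the partition's offset range with 
the number of events sent. A minimal Python sketch of that arithmetic follows. 
It assumes the get offset command prints one line per partition in the form 
"<topic>:<partition>:<offset>"; the function name excess_events and the sample 
values in the usage note are hypothetical, not measurements from this issue.

```python
def excess_events(earliest_line, latest_line, events_sent):
    """Return how many more events the partition holds than were sent.

    Both *_line arguments are assumed to look like "<topic>:<partition>:<offset>".
    """
    def offset(line):
        # Take the text after the last ':' as the numeric offset.
        return int(line.strip().rsplit(":", 1)[1])

    stored = offset(latest_line) - offset(earliest_line)
    return stored - events_sent
```

For example, excess_events("events:0:0", "events:0:4000137", 4000000) yields 
137 with these made-up offsets; any value above zero means the partition holds 
more events than the producer sent.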



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
