Arun Mathew created KAFKA-3643:
----------------------------------

             Summary: Data Duplication on clean restart of Kafka Broker
                 Key: KAFKA-3643
                 URL: https://issues.apache.org/jira/browse/KAFKA-3643
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 0.9.0.1
            Reporter: Arun Mathew

We observed event duplication when partition leadership is restored from the interim leader back to the preferred leader after the preferred leader is restarted.

Steps to Reproduce
- Set up a three-broker Kafka cluster (B1, B2, B3).
- Create a topic with 3 replicas and 1 partition.
- [B1 is assigned as the (preferred) leader; B2 and B3 are in the ISR.]
- Start sending events with the performance producer, using an event count large enough (say 4 million) that the run lasts a few minutes and covers the broker restart interval.
- Set the producer batch size to 1.
- Cleanly shut down the leader broker B1.
- Event sending continues.
- B2 is now the new leader and B3 is in the ISR.
- Restart broker B1 (the preferred leader for partition 0).
- The replica on B1 catches up and becomes the leader for P-0.
- Wait for the producer to finish.
- Use the get offset command to get the event count in the partition; the count is higher than the number of events sent (4M).
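
The final check in the steps above can be sketched as a small script. This is only an illustration: it assumes the "get offset command" is kafka.tools.GetOffsetShell, that it was run once for the latest offsets and once for the earliest offsets, and that each output line has the form topic:partition:offset with a single offset. The function names, the topic name, and the offset values are hypothetical and are not taken from the actual run.

```python
def parse_offsets(output):
    """Parse 'topic:partition:offset' lines into {(topic, partition): offset}.

    Assumes one offset per line; rsplit keeps topic names that contain ':' intact.
    """
    offsets = {}
    for line in output.strip().splitlines():
        topic, partition, offset = line.strip().rsplit(":", 2)
        offsets[(topic, int(partition))] = int(offset)
    return offsets


def surplus_events(latest_output, earliest_output, events_sent):
    """Return how many more events the log holds than the producer sent.

    Stored events per partition = latest offset - earliest offset.
    A positive result suggests duplicated events.
    """
    latest = parse_offsets(latest_output)
    earliest = parse_offsets(earliest_output)
    stored = sum(latest[key] - earliest.get(key, 0) for key in latest)
    return stored - events_sent


if __name__ == "__main__":
    # Hypothetical command output, for illustration only.
    latest = "perf-test:0:4000123\n"
    earliest = "perf-test:0:0\n"
    print(surplus_events(latest, earliest, 4_000_000))
```

A result of zero would mean the partition holds exactly the number of events sent; any positive value is the number of surplus events in the partition.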