[ https://issues.apache.org/jira/browse/KAFKA-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15318799#comment-15318799 ]
Jun Rao commented on KAFKA-1211: -------------------------------- This is still an issue. It can cause data loss if the leader of a partition changes too quickly. This is less likely to happen with the fix in KAFKA-3670, but could still happen in theory. Fixing this is a bit involved since it would require the leader and the follower to keep track of and communicate additional information about leader generations, and may potentially require a change in message format. > Hold the produce request with ack > 1 in purgatory until replicas' HW has > larger than the produce offset > -------------------------------------------------------------------------------------------------------- > > Key: KAFKA-1211 > URL: https://issues.apache.org/jira/browse/KAFKA-1211 > Project: Kafka > Issue Type: Bug > Reporter: Guozhang Wang > Assignee: Guozhang Wang > Fix For: 0.10.1.0 > > > Today during leader failover we will have a weakness period when the > followers truncate their data before fetching from the new leader, i.e., > number of in-sync replicas is just 1. If during this time the leader has also > failed then produce requests with ack >1 that have get responded will still > be lost. To avoid this scenario we would prefer to hold the produce request > in purgatory until replica's HW has larger than the offset instead of just > their end-of-log offsets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)