Xing Huang created KAFKA-2960:
---------------------------------

             Summary: DelayedProduce may cause message loss during repeated 
leader changes
                 Key: KAFKA-2960
                 URL: https://issues.apache.org/jira/browse/KAFKA-2960
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 0.9.0.0
            Reporter: Xing Huang


Related to #KAFKA-1148.
When a leader replica becomes a follower and then the leader again, it may have 
truncated its log while it was a follower. When it becomes leader the second 
time, its ISR may shrink, and if new messages are appended at that moment, a 
DelayedProduce created during its first term as leader may be satisfied. The 
client then receives a response with no error, even though the messages were 
actually lost.
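
The sequence above can be replayed in a toy model. This is only a sketch under 
stated assumptions: the class, its fields, the offsets, and the completion check 
(leader is local and high watermark has reached the required offset) are all 
hypothetical simplifications for illustration, not the broker's actual code.

```java
// Toy model of the scenario; every name here is hypothetical, not a Kafka class.
final class FalseAckDemo {
    /** Stand-in for the log state of a single replica. */
    static final class Replica {
        boolean isLeader = true;
        long logEndOffset = 0;   // next offset to be written
        long highWatermark = 0;  // offsets below this are treated as committed

        /** Appends count messages and returns the offset the HW must reach. */
        long append(int count) {
            logEndOffset += count;
            return logEndOffset;
        }

        void becomeFollowerAndTruncateTo(long offset) {
            isLeader = false;
            logEndOffset = Math.min(logEndOffset, offset);
            highWatermark = Math.min(highWatermark, offset);
        }

        void becomeLeaderAgain() {
            isLeader = true;
        }

        /** Assumes the ISR has shrunk to the leader alone, so the HW follows the log end. */
        void advanceHighWatermarkAlone() {
            highWatermark = logEndOffset;
        }
    }

    /** Simplified completion check that ignores which leadership term made the append. */
    static boolean naiveSatisfied(Replica r, long requiredOffset) {
        return r.isLeader && r.highWatermark >= requiredOffset;
    }

    /** Replays the leader -> follower -> leader sequence; returns whether the stale wait completes. */
    static boolean replay() {
        Replica r = new Replica();
        r.logEndOffset = 90;
        r.highWatermark = 90;
        long required = r.append(10);        // first term: offsets 90..99, waiting for HW >= 100
        r.becomeFollowerAndTruncateTo(90);   // the 10 unreplicated messages are dropped
        r.becomeLeaderAgain();               // second term as leader
        r.append(15);                        // unrelated new messages now occupy 90..104
        r.advanceHighWatermarkAlone();       // HW moves to 105
        return naiveSatisfied(r, required);  // the old wait looks satisfied
    }
}
```

In this model `FalseAckDemo.replay()` returns true: the pending produce from the 
first term is reported as successful although its messages were truncated.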

We simulated this scenario and confirmed that the message loss can happen. 
Judging by our broker and client logs, it also appears to be the cause of a 
data loss we experienced recently.

I think we should check the leader epoch when sending a response, or satisfy 
the DelayedProduce when the leader changes, as described in #KAFKA-1148.

We may also need a new error code to inform the producer about this error. 
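
A minimal sketch of the first option, assuming the pending produce records the 
leader epoch at append time and compares it on completion. The type names, the 
method signature, and the LEADER_EPOCH_CHANGED outcome are hypothetical 
placeholders; no such error code is defined in this report.

```java
import java.util.Optional;

// Hypothetical sketch; none of these names come from the Kafka code base.
final class EpochCheckedProduce {
    /** LEADER_EPOCH_CHANGED stands in for the proposed new error code. */
    enum Outcome { SUCCESS, LEADER_EPOCH_CHANGED }

    /** A produce request waiting for acknowledgement from the ISR. */
    static final class PendingProduce {
        final int epochAtAppend;    // leader epoch when the messages were appended
        final long requiredOffset;  // HW must reach this offset for success

        PendingProduce(int epochAtAppend, long requiredOffset) {
            this.epochAtAppend = epochAtAppend;
            this.requiredOffset = requiredOffset;
        }

        /**
         * Returns an outcome if the request can be completed now,
         * or Optional.empty() if it should keep waiting.
         */
        Optional<Outcome> tryComplete(boolean isLeader, int currentEpoch, long highWatermark) {
            if (currentEpoch != epochAtAppend) {
                // Leadership changed since the append, so the log may have been
                // truncated; fail the request instead of trusting the HW.
                return Optional.of(Outcome.LEADER_EPOCH_CHANGED);
            }
            if (isLeader && highWatermark >= requiredOffset) {
                return Optional.of(Outcome.SUCCESS);
            }
            return Optional.empty();
        }
    }
}
```

With this check, a request appended at epoch 5 that is re-examined at epoch 7 
completes with the error outcome even if the high watermark has passed its 
required offset, so the producer can retry instead of assuming success.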



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
