[ https://issues.apache.org/jira/browse/KAFKA-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15419984#comment-15419984 ]

Flavio Junqueira commented on KAFKA-1211:
-----------------------------------------

[~junrao]

bq. the leader needs to first wait for the follower to receive a message before 
it can advance the last committed offset.

makes sense

bq. it can propagate the last committed offset to the follower

makes sense

bq. the last committed offset in the follower is always behind that in the 
leader

makes sense; the follower's last committed offset is either equal to the 
leader's or behind it, never ahead.

bq. Since the follower truncates based on the local last committed offset, it's 
possible for the follower to truncate messages that are already committed by 
the leader.

I'm not sure why we are doing this. Upon recovery, a follower can't truncate 
until it hears from the leader; it shouldn't truncate based on its local last 
committed offset.
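
To make the concern concrete, here is a toy sketch in Python. It is not 
Kafka's code: `Replica`, `Leader`, and `fetch` are hypothetical names, and the 
model assumes the leader advances its last committed offset (HW) to the 
smallest log end offset it knows of across replicas, and that a follower only 
learns the leader's HW from a fetch response.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    log: list = field(default_factory=list)
    hw: int = 0  # last committed offset; log[:hw] is the committed prefix


@dataclass
class Leader(Replica):
    # Follower name -> log end offset last reported to the leader (assumed model).
    follower_leo: dict = field(default_factory=dict)


def fetch(leader, name, follower):
    """One fetch round trip in this toy model."""
    # The request reports how much of the log the follower already holds.
    leader.follower_leo[name] = len(follower.log)
    # The leader commits only what every known replica holds.
    leader.hw = max(leader.hw, min([len(leader.log), *leader.follower_leo.values()]))
    # The response carries any new entries plus the leader's current HW.
    follower.log.extend(leader.log[len(follower.log):])
    follower.hw = min(leader.hw, len(follower.log))


leader = Leader(follower_leo={"a": 0, "b": 0})
a, b = Replica(), Replica()
leader.log += ["m0", "m1"]

for _ in range(2):  # two fetch rounds, follower a always going first
    fetch(leader, "a", a)
    fetch(leader, "b", b)

# The leader committed both messages when b's second fetch arrived,
# but a has not fetched since then, so its local HW is still 0.
committed_on_leader = leader.log[:leader.hw]
dropped_if_truncating_locally = a.log[a.hw:]
print(leader.hw, a.hw, committed_on_leader, dropped_if_truncating_locally)
```

In this model, follower `a` holds both messages but its local HW is still 
behind, so truncating to the local HW would drop messages the leader has 
already committed. Waiting to hear from the leader before truncating avoids 
that.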

> Hold the produce request with ack > 1 in purgatory until replicas' HW has 
> larger than the produce offset
> --------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1211
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1211
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Guozhang Wang
>            Assignee: Guozhang Wang
>             Fix For: 0.11.0.0
>
>
> Today, during leader failover, there is a window of weakness in which the 
> followers truncate their data before fetching from the new leader, i.e., the 
> number of in-sync replicas is just 1. If the leader also fails during this 
> window, produce requests with ack > 1 that have already been responded to 
> will still be lost. To avoid this scenario, we would prefer to hold the 
> produce request in purgatory until the replicas' HW is larger than the 
> produce offset, instead of just their end-of-log offsets.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
