The replica lag definition now is time based, so as long as a replica can catch up with leader in replica.lag.time.max.ms, it is in ISR, no matter how many messages it is behind.
And yes, your understanding is correct - ACK is sent back either when all replica in ISR got the message or the request timeout. I had some related slides here might help a bit. http://www.slideshare.net/JiangjieQin/no-data-loss-pipeline-with-apache-kaf ka-49753844 Thanks, Jiangjie (Becket) Qin On 7/7/15, 9:28 AM, "Stevo Slavić" <ssla...@gmail.com> wrote: >Thanks for heads up and code reference! > >Traced back required offset to >https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/serve >r/ReplicaManager.scala#L303 > >Have to investigate more, but from initial check was expecting to see >there >reference to "replica.lag.max.messages" (so even when replica is between 0 >and maxLagMessages behind to be considered on required offset to be >considered as insync). Searching through trunk cannot find where in main >code is "replica.lag.max.messages" configuration property used. > >Used search query >https://github.com/apache/kafka/search?utf8=%E2%9C%93&q=%22replica.lag.max >.messages%22&type=Code > >Maybe it's going to be removed in next release?! > >Time based lag is still there. > >Anyway, if I understood correctly, with request.required.acks=-1, when a >message/batch is published, it's first written to lead, then other >partition replicas either continuously poll and get in sync with lead, or >through zookeeper get notified that they are behind and poll and get in >sync with lead, and as soon as enough (min.insync.replicas - 1) replicas >are detected to be fully in sync with lead, ACK is sent to producer >(unless >timeout occurs first). > >On Tue, Jul 7, 2015 at 5:15 PM, Gwen Shapira <gshap...@cloudera.com> >wrote: > >> Ah, I think I see the confusion: Replicas don't actually ACK at all. >> What happens is that the replica manager waits for enough ISR replicas >> to reach the correct offset >> Partition.checkEnoughReplicasReachOffset(...) has this logic. A >> replica can't reach offset of second batch, without first having >> written the first batch. So I believe we are safe in this scenario. >> >> Gwen >> >> On Tue, Jul 7, 2015 at 8:01 AM, Stevo Slavić <ssla...@gmail.com> wrote: >> > Hello Gwen, >> > >> > Thanks for fast response! >> > >> > Btw, congrats on officially becoming a Kafka committer and thanks, >>among >> > other things, for great "Intro to Kafka" video >> > http://shop.oreilly.com/product/0636920038603.do ! >> > >> > Have to read more docs and/or source. I thought this scenario is >>possible >> > because replica can fall behind (replica.lag.max.messages) and still >>be >> > considered ISR. Then I assumed also write can be ACKed by any ISR, and >> then >> > why not by one which has fallen more behind. >> > >> > Kind regards, >> > Stevo Slavic. >> > >> > On Tue, Jul 7, 2015 at 4:47 PM, Gwen Shapira <gshap...@cloudera.com> >> wrote: >> > >> >> I am not sure "different replica" can ACK the second back of messages >> >> while not having the first - from what I can see, it will need to be >> >> up-to-date on the latest messages (i.e. correct HWM) in order to ACK. >> >> >> >> On Tue, Jul 7, 2015 at 7:13 AM, Stevo Slavić <ssla...@gmail.com> >>wrote: >> >> > Hello Apache Kafka community, >> >> > >> >> > Documentation for min.insync.replicas in >> >> > http://kafka.apache.org/documentation.html#brokerconfigs states: >> >> > >> >> > "When used together, min.insync.replicas and request.required.acks >> allow >> >> > you to enforce greater durability guarantees. A typical scenario >> would be >> >> > to create a topic with a replication factor of 3, set >> min.insync.replicas >> >> > to 2, and produce with request.required.acks of -1. This will >>ensure >> that >> >> > the producer raises an exception if a majority of replicas do not >> >> receive a >> >> > write." >> >> > >> >> > Correct me if wrong (doc reference?), I assume min.insync.replicas >> >> includes >> >> > lead, so with min.insync.replicas=2, lead and one more replica >>besides >> >> lead >> >> > will have to ACK writes. >> >> > >> >> > In such setup, with minimalistic 3 brokers cluster, given that >> >> > - all 3 replicas are insync >> >> > - a batch of messages is written and ends up on lead and one >>replica >> ACKs >> >> > - another batch of messages ends up on lead and different replica >>ACKs >> >> > >> >> > Is it possible that when lead crashes, while replicas didn't catch >>up, >> >> > (part of) one batch of messages could be lost (since one replica >> becomes >> >> a >> >> > new lead, and it's only serving all reads and requests, and >> replication >> >> is >> >> > one way)? >> >> > >> >> > Kind regards, >> >> > Stevo Slavic. >> >> >>