Andiya,

On Wed, 15 Jan 2020 at 21:59, Anindya Haldar <anindya.hal...@oracle.com>
wrote:

> Okay, let’s say
>
> - the application is using a non-transactional producer, shared across
> multiple threads
> - the linger.ms and buffer.memory is non-zero, and so is batch.size such
> that messages are actually batched
> - the replication factor is 3
> - the minimum number of ISRs is 2
> - the parameter ack is set to ‘all’
>
> Now the application calls send(), get a future back, and then calls get()
> on the future. At some point (driven by the batching related parameters and
> a number of other factors) the get() call to the future returns
> successfully.
>
> Precisely at this point does Kafka guarantee that the message has been
> persisted to the leader’s and all the ISRs’ logs? By persisted, I mean
> written to the replication logs, but may or may not yet have been committed
> to the storage media by the fsync() call.
>
  At this stage the ISRs (this includes leader) are all acknowledged.

>
> If the answer is yes, it looks good from here. If the answer is no, then
> what else does the application need to do?
>
> Sincerely,
> Anindya Haldar
> Oracle Responsys
>
>
> > On Jan 15, 2020, at 12:31 PM, M. Manna <manme...@gmail.com> wrote:
> >
> > Hey Anindya,
> >
> >
> >
> > On Wed, 15 Jan 2020 at 18:23, Anindya Haldar <anindya.hal...@oracle.com>
> > wrote:
> >
> >> Thanks for the response.
> >>
> >> Essentially, we are looking for a confirmation that a send
> acknowledgement
> >> received at the client’s end will ensure the message is indeed
> persisted to
> >> the replication logs. We initially wondered whether the client has to
> make
> >> an explicit flush() call or whether it has to commit a producer
> transaction
> >> for that to happen. Based upon what I understand now from your
> response, a
> >> flush() or commitTransaction() call should not be necessary for this,
> and a
> >> send acknowledgement via the successful return from the get() call on
> the
> >> future will ensure the persistence of the message.
> >>
> >> Please feel free to correct me if I didn’t get it right.
> >>
> >
> > I'm sure you have done the reading, but to be in context of your
> question,
> > *commitTransaction()* is sufficient on it's own (see excerpt from
> *flush()*
> > doc below)
> >
> > *Applications don't need to call this method for transactional producers,
> >> since the commitTransaction() will flush all buffered records before
> >> performing the commit. This ensures that all the send(ProducerRecord)
> calls
> >> made since the previous beginTransaction() are completed before the
> >> commit. *
> >
> >
> >  But you *do *need to call commitTransaction() (for txn based producers),
> > or flush() (for normal cases) to send the records *immediately*.
> Otherwise,
> > they will be sent when the data buffer is full (re: buffer.memory and
> > linger.ms).
> >
> >  If you want to know more about transactions, there are some nice
> articles
> > regarding txn producers
> >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__cwiki.apache.org_confluence_display_KAFKA_Transactional-2BMessaging-2Bin-2BKafka&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=vmJiAMDGSxNeZnFztNs5ITB_i_Z3h3VtLPGma9y7cKI&m=qtRoal09Ax8f1wskhpGkLJz8loX98EAVCX95pMjnI8s&s=laTP9-1xOTyb1L9AFMVLYSlvZE-nfgJ7N4rsL3NyZvU&e=
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.confluent.io_blog_transactions-2Dapache-2Dkafka_&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=vmJiAMDGSxNeZnFztNs5ITB_i_Z3h3VtLPGma9y7cKI&m=qtRoal09Ax8f1wskhpGkLJz8loX98EAVCX95pMjnI8s&s=SMCrXdI5TvfT6FEiqpQAA_8f8x8RA2MRFzrOKJmCFFc&e=
> >
> > Also, if you are interested to become more technical, please check the
> > codebase for KafkaProducer and see what doSend() and wakeup() is doing:
> >
> >
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_apache_kafka_blob_5c00191ea957fef425bf5dbbe47d70e41249e2d6_clients_src_main_java_org_apache_kafka_clients_producer_KafkaProducer.java-23L832&d=DwIFaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=vmJiAMDGSxNeZnFztNs5ITB_i_Z3h3VtLPGma9y7cKI&m=qtRoal09Ax8f1wskhpGkLJz8loX98EAVCX95pMjnI8s&s=xruUiNP1BFXu6CziC0aB00HcoX7GyH8HNalyLp-CYlI&e=
> >
> > I hope this helps.
> >
> > Regards,
> >
> >>
> >> Sincerely,
> >> Anindya Haldar
> >> Oracle Responsys
> >>
> >>
> >>> On Jan 15, 2020, at 8:55 AM, M. Manna <manme...@gmail.com> wrote:
> >>>
> >>> Anindya,
> >>>
> >>> On Wed, 15 Jan 2020 at 16:49, Anindya Haldar <
> anindya.hal...@oracle.com>
> >>> wrote:
> >>>
> >>>> In our case, the minimum in-sync replicas is set to 2.
> >>>>
> >>>> Given that, what will be expected behavior for the scenario I
> outlined?
> >>>>
> >>>
> >>> This means you will get confirmation when 2 of them have acknowledged.
> so
> >>> you will always have 2 in-sync.
> >>>
> >>> Perhaps drilling each detail and having a long thread, you could
> explain
> >>> what is it you are trying to investigate/identify? We will be happy to
> >> help.
> >>>
> >>> Regards,
> >>>
> >>>
> >>>> Sincerely,
> >>>> Anindya Haldar
> >>>> Oracle Responsys
> >>>>
> >>>>
> >>>>> On Jan 15, 2020, at 6:38 AM, Ismael Juma <ism...@juma.me.uk> wrote:
> >>>>>
> >>>>> To all the in-sync replicas. You can set the minimum number of
> in-sync
> >>>>> replicas via the min.insync.replicas topic/broker config.
> >>>>>
> >>>>> Ismael
> >>>>>
> >>>>> On Tue, Jan 14, 2020 at 11:11 AM Anindya Haldar <
> >>>> anindya.hal...@oracle.com>
> >>>>> wrote:
> >>>>>
> >>>>>> I have a question related to the semantics of a producer send and
> the
> >>>> get
> >>>>>> calls on the future returned by the send call.
> >>>>>>
> >>>>>> - It is a Java application, using the Kafka Java client library
> >>>>>> - The application is set up to use 3 replicas and using acks=all for
> >> the
> >>>>>> producer
> >>>>>> - the application is using a non-zero value for linger.ms and
> >>>> batch.size
> >>>>>> parameters
> >>>>>> - The application is using a single non-transactional Kafka producer
> >>>>>> instance, shared across a number of threads
> >>>>>>
> >>>>>> With that,
> >>>>>>
> >>>>>> - Any application thread makes a send() call on the producer.
> >>>>>> - Then the same thread calls get() on the future returned by the
> last
> >>>>>> send() call
> >>>>>> - The get() call on the future returns after it gets the
> >> acknowledgement
> >>>>>> from the system for the message send
> >>>>>>
> >>>>>> At this point, is it guaranteed that the message has actually been
> >>>> written
> >>>>>> (but may not be committed by calling fsync) to ALL of the replicas’
> >>>>>> filesystems?
> >>>>>>
> >>>>>> Sincerely,
> >>>>>> Anindya Haldar
> >>>>>> Oracle Responsys
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
>

Reply via email to