The KIP-320 conflict can be resolved by saying that the leader broker on the destination "stamps" its own local leader epoch on the incoming messages, meaning the offsets "transfer" but leader epochs do not.
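
To make that concrete, here is a rough sketch in Python (not broker code) of what I have in mind. All names (`Batch`, `DestinationLeader`, `append_with_offsets`) are made up for illustration, and the rule that an incoming base offset must not be below the log end offset is my assumption, not something taken from the KIP text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Batch:
    base_offset: int    # offset of the first record, as assigned on the source cluster
    record_count: int   # number of records in the batch
    leader_epoch: int   # epoch of the leader that appended the batch


@dataclass
class DestinationLeader:
    """Hypothetical stand-in for the partition leader on the destination cluster."""
    local_leader_epoch: int
    log: List[Batch] = field(default_factory=list)

    @property
    def log_end_offset(self) -> int:
        # Next offset that would be written to the local log.
        if not self.log:
            return 0
        last = self.log[-1]
        return last.base_offset + last.record_count

    def append_with_offsets(self, incoming: Batch) -> Batch:
        # Assumption of this sketch: offsets may never go backwards.
        if incoming.base_offset < self.log_end_offset:
            raise ValueError(
                f"base offset {incoming.base_offset} is below "
                f"log end offset {self.log_end_offset}"
            )
        # Offsets "transfer": keep the producer-supplied base offset.
        # Leader epochs do not: overwrite with the local leader epoch.
        stamped = Batch(
            base_offset=incoming.base_offset,
            record_count=incoming.record_count,
            leader_epoch=self.local_leader_epoch,
        )
        self.log.append(stamped)
        return stamped


leader = DestinationLeader(local_leader_epoch=3)
stamped = leader.append_with_offsets(
    Batch(base_offset=100, record_count=5, leader_epoch=17)
)
print(stamped.base_offset, stamped.leader_epoch)  # prints "100 3"
```

The source epoch (17 here) is simply dropped, so anything on the destination that validates a leader epoch only ever sees epochs the destination cluster itself produced.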

On Mon, Jan 7, 2019 at 1:38 PM Edoardo Comar <eco...@uk.ibm.com> wrote:

> Hi,
> I delayed starting the voting thread due to the festive period. I would like to start it this week.
> Has anyone any more feedback ?
>
> --------------------------------------------------
> Edoardo Comar
> IBM Event Streams
>
> Edoardo Comar <eco...@uk.ibm.com> wrote on 13/12/2018 17:50:30:
>
> > From: Edoardo Comar <eco...@uk.ibm.com>
> > To: dev@kafka.apache.org
> > Date: 13/12/2018 17:50
> > Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication
> >
> > Hi,
> > as we haven't got any more feedback, we'd like to start a vote on KIP-391 on Monday
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-391%3A+Allow+Producing+with+Offsets+for+Cluster+Replication
> >
> > --------------------------------------------------
> > Edoardo Comar
> > IBM Event Streams
> > IBM UK Ltd, Hursley Park, SO21 2JN
> >
> > Edoardo Comar/UK/IBM wrote on 10/12/2018 10:20:06:
> >
> > > From: Edoardo Comar/UK/IBM
> > > To: dev@kafka.apache.org
> > > Date: 10/12/2018 10:20
> > > Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication
> > >
> > > (shameless bump) any additional feedback is welcome ... thanks!
> > >
> > > Edoardo Comar <eco...@uk.ibm.com> wrote on 27/11/2018 15:35:09:
> > >
> > > > From: Edoardo Comar <eco...@uk.ibm.com>
> > > > To: dev@kafka.apache.org
> > > > Date: 27/11/2018 15:35
> > > > Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication
> > > >
> > > > Hi Jason
> > > >
> > > > we envisioned the replicator to replicate the __consumer_offsets topic too (although without producing-with-offsets to it!).
> > > >
> > > > As there is no client-side implementation yet using the leader epoch, we could not yet see the impact of writing to the destination cluster __consumer_offsets records with an invalid leader epoch.
> > > >
> > > > Also, applications might still use external storage mechanism for consumer offsets where the leader_epoch is missing.
> > > >
> > > > Perhaps the replicator could - for the __consumer_offsets topic - just omit the leader_epoch field in the data sent to destination.
> > > >
> > > > What do you think ?
> > > >
> > > > Jason Gustafson <ja...@confluent.io> wrote on 27/11/2018 00:09:56:
> > > >
> > > > > Another wrinkle to consider is KIP-320. If you are planning to replicate __consumer_offsets directly, then you will have to account for leader epoch information which is stored with the committed offsets. But I cannot think how it would be possible to replicate the leader epoch information in messages even if you can preserve offsets.
> > > > >
> > > > > -Jason
> > > > >
> > > > > On Mon, Nov 26, 2018 at 1:16 PM Mayuresh Gharat <gharatmayures...@gmail.com> wrote:
> > > > >
> > > > > > Hi Edoardo,
> > > > > >
> > > > > > Thanks a lot for the KIP.
> > > > > > I have a few questions/suggestions in addition to what Radai has mentioned above :
> > > > > >
> > > > > > 1. Is this meant only for 1:1 replication, for example one Kafka cluster replicating to other, instead of having multiple Kafka clusters mirroring into one Kafka cluster?
> > > > > > 2. Are we relying on exactly once produce in the replicator? If not, how are retries handled in the replicator ?
> > > > > > 3. What is the recommended value for inflight requests, here.
> > > > > > Is it supposed to be strictly 1, if yes, it would be great to mention that in the KIP.
> > > > > > 4. How is unclean Leader election between source cluster and destination cluster handled?
> > > > > > 5. How are offsets resets in case of the replicator's consumer handled?
> > > > > > 6. It would be good to explain the workflow in the KIP, with an example, regarding how this KIP will change the replication scenario and how it will benefit the consumer apps.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Mayuresh
> > > > > >
> > > > > > On Mon, Nov 26, 2018 at 8:08 AM radai <radai.rosenbl...@gmail.com> wrote:
> > > > > >
> > > > > > > a few questions:
> > > > > > >
> > > > > > > 1. how do you handle possible duplications caused by the "special" producer timing-out/retrying? are you explicitly relying on the "exactly once" sequencing?
> > > > > > > 2. what about the combination of log compacted topics + replicator downtime? by the time the replicator comes back up there might be "holes" in the source offsets (some msgs might have been compacted out)? how is that recoverable?
> > > > > > > 3. similarly, what if you try and fire up replication on a non-empty source topic? does the kip allow for offsets starting at some arbitrary X > 0 ? or would this have to be designed from the start.
> > > > > > >
> > > > > > > and lastly, since this KIP seems to be designed for active-passive failover (there can be no produce traffic except the replicator) wouldn't a solution based on seeking to a time offset be more generic?
> > > > > > > your producers could checkpoint the last (say log append) timestamp of records they've seen, and when restoring in the remote site seek to those timestamps (which will be metadata in their committed offsets) - assuming replication takes > 0 time you'd need to handle some dups, but every kafka consumer setup needs to know how to handle those anyway.
> > > > > > >
> > > > > > > On Fri, Nov 23, 2018 at 2:27 AM Edoardo Comar <eco...@uk.ibm.com> wrote:
> > > > > > >
> > > > > > > > Hi Stanislav
> > > > > > > >
> > > > > > > > > > The flag is needed to distinguish a batch with a desired base offset of 0, from a regular batch for which offsets need to be generated.
> > > > > > > > > If the producer can provide offsets, why not provide a base offset of 0?
> > > > > > > >
> > > > > > > > a regular batch (for which offsets are generated by the broker on write) is sent with a base offset of 0.
> > > > > > > > How could you distinguish it from a batch where you *want* the first record to be written at offset 0 (i.e. be the first in the partition and be rejected if there are records on the log already) ?
> > > > > > > > We wanted to avoid a "deep" inspection (and potentially decompression) of the records.
> > > > > > > >
> > > > > > > > For the replicator use case, a single produce request where all the data is to be assumed with offset, or all without offsets, seems to suffice,
> > > > > > > > So we added only a toplevel flag, not a per-topic-partition one.
> > > > > > > >
> > > > > > > > Thanks for your interest !
> > > > > > > > cheers
> > > > > > > > Edo
> > > > > > > >
> > > > > > > > Stanislav Kozlovski <stanis...@confluent.io> wrote on 22/11/2018 22:32:42:
> > > > > > > >
> > > > > > > > > From: Stanislav Kozlovski <stanis...@confluent.io>
> > > > > > > > > To: dev@kafka.apache.org
> > > > > > > > > Date: 22/11/2018 22:33
> > > > > > > > > Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication
> > > > > > > > >
> > > > > > > > > Hey Edo & Mickael,
> > > > > > > > >
> > > > > > > > > > The flag is needed to distinguish a batch with a desired base offset of 0, from a regular batch for which offsets need to be generated.
> > > > > > > > > If the producer can provide offsets, why not provide a base offset of 0?
> > > > > > > > >
> > > > > > > > > > (I am reading your post thinking about partitions rather than topics).
> > > > > > > > > Yes, I meant partitions. Sorry about that.
> > > > > > > > >
> > > > > > > > > Thanks for answering my questions :)
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Stanislav
> > > > > > > > >
> > > > > > > > > On Thu, Nov 22, 2018 at 5:28 PM Edoardo Comar <eco...@uk.ibm.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Stanislav,
> > > > > > > > > >
> > > > > > > > > > you're right we envision the replicator use case to have a single producer with offsets per partition (I am reading your post thinking about partitions rather than topics).
> > > > > > > > > >
> > > > > > > > > > If a regular producer was to send its own records at the same time, it's very likely that the one sending with an offset will fail because of invalid offsets.
> > > > > > > > > > Same if two producers were sending with offsets, likely both would then fail.
> > > > > > > > > >
> > > > > > > > > > > Does it make sense to *lock* the topic from other producers while there is one that uses offsets?
> > > > > > > > > >
> > > > > > > > > > You could do that with ACL permissions if you wanted, I don't think it needs to be mandated by changing the broker logic.
> > > > > > > > > >
> > > > > > > > > > > Since we are tying the produce-with-offset request to the ACL, do we need the `use_offset` field in the produce request? Maybe we make it mandatory for produce requests with that ACL to have offsets.
> > > > > > > > > >
> > > > > > > > > > The flag is needed to distinguish a batch with a desired base offset of 0, from a regular batch for which offsets need to be generated.
> > > > > > > > > > I would not restrict a principal to only send-with-offsets (by making that mandatory via the ACL).
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > > Edo & Mickael
> > > > > > > > > >
> > > > > > > > > > Stanislav Kozlovski <stanis...@confluent.io> wrote on 22/11/2018 16:17:11:
> > > > > > > > > >
> > > > > > > > > > > From: Stanislav Kozlovski <stanis...@confluent.io>
> > > > > > > > > > > To: dev@kafka.apache.org
> > > > > > > > > > > Date: 22/11/2018 16:17
> > > > > > > > > > > Subject: Re: [DISCUSS] KIP-391: Allow Producing with Offsets for Cluster Replication
> > > > > > > > > > >
> > > > > > > > > > > Hey Edoardo, thanks for the KIP!
> > > > > > > > > > >
> > > > > > > > > > > I have some questions, apologies if they are naive:
> > > > > > > > > > > Is this intended to work for a single producer use case only?
> > > > > > > > > > > How would it work if two producers were producing to the same topic with offsets?
> > > > > > > > > > > How would it work if two producers, one with offsets and one without were producing to a topic?
> > > > > > > > > > > Does it make sense to *lock* the topic from other producers while there is one that uses offsets?
> > > > > > > > > > >
> > > > > > > > > > > Since we are tying the produce-with-offset request to the ACL, do we need the `use_offset` field in the produce request? Maybe we make it mandatory for produce requests with that ACL to have offsets.
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Stanislav
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Nov 21, 2018 at 5:14 PM Edoardo Comar <eco...@uk.ibm.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi,
> > > > > > > > > > > > we've opened a KIP to improve data replication between Kafka clusters :
> > > > > > > > > > > >
> > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-391%3A+Allow+Producing+with+Offsets+for+Cluster+Replication
> > > > > > > > > > > >
> > > > > > > > > > > > We'd like to start a discussion, please post your feedback in this thread.
> > > > > > > > > > > >
> > > > > > > > > > > > Thank you
> > > > > > > > > > > > Edo and Mickael
> > > > > > > > > > > >
> > > > > > > > > > > > Unless stated otherwise above:
> > > > > > > > > > > > IBM United Kingdom Limited - Registered in England and Wales with number 741598.
> > > > > > > > > > > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
> > > > > >
> > > > > > --
> > > > > > -Regards,
> > > > > > Mayuresh R.
> > > > > > Gharat
> > > > > > (862) 250-7125