Yeah good point. I relent!

-jay

On Fri, Dec 16, 2016 at 1:46 PM Jason Gustafson <ja...@confluent.io> wrote:

> Jay/Ismael,
>
> I agree that lazy initialization of metadata seems unavoidable. Ideally, we
> could follow the same pattern for transactions, but remember that in the
> consumer+producer use case, the initialization needs to be completed prior
> to setting the consumer's position. Otherwise we risk reading stale
> offsets. But it would be pretty awkward if you had to begin a transaction
> first just to ensure that your consumer can read the right offsets, right?
> It's a bit easier to explain that you should always call `producer.init()`
> prior to initializing the consumer. Users would probably get this right
> without any special effort.
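>
> (A minimal sketch of that ordering, assuming the proposed `init()` method,
> client Properties prepared elsewhere, and otherwise the standard Java
> clients; the exact API shape is of course still under discussion:)
>
>     import java.util.Collections;
>     import org.apache.kafka.clients.consumer.KafkaConsumer;
>     import org.apache.kafka.clients.producer.KafkaProducer;
>
>     // Finish transactional initialization first, so any pending
>     // transaction state is resolved before committed offsets are read.
>     // producerProps / consumerProps: standard client configs, prepared elsewhere.
>     KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps);
>     producer.init();  // hypothetical method from this proposal
>
>     // Only then create the consumer; the committed offsets it fetches
>     // can no longer be stale with respect to an open transaction.
>     KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
>     consumer.subscribe(Collections.singletonList("input-topic"));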
>
> -Jason
>
> On Wed, Dec 14, 2016 at 1:52 AM, Rajini Sivaram <rsiva...@pivotal.io> wrote:
>
> > Hi Apurva,
> >
> > Thank you for the answers. Just one follow-on.
> >
> > 15. Let me rephrase my original question. If all control messages
> > (messages to transaction logs and markers on user logs) were acknowledged
> > only after flushing the log segment, would transactions become durable in
> > the traditional sense (i.e. not restricted to min.insync.replicas
> > failures)? This is not a suggestion to update the KIP. It seems to me
> > that the design enables full durability, if required in the future, with
> > a rather non-intrusive change. I just wanted to make sure I haven't
> > missed anything fundamental that prevents Kafka from doing this.
> >
> > On Wed, Dec 14, 2016 at 5:30 AM, Ben Kirwin <b...@kirw.in> wrote:
>
> > > Hi Apurva,
> > >
> > > Thanks for the detailed answers... and sorry for the late reply!
> > >
> > > It does sound like, if the input-partitions-to-app-id mapping never
> > > changes, the existing fencing mechanisms should prevent duplicates.
> > > Great! I'm a bit concerned the proposed API will be delicate to program
> > > against successfully -- even in the simple case, we need to create a
> > > new producer instance per input partition, and anything fancier is
> > > going to need its own implementation of the Streams/Samza-style 'task'
> > > idea -- but that may be fine for this sort of advanced feature.
> > >
> > > For the second question, I notice that Jason also elaborated on this
> > > downthread:
> > >
> > > > We also looked at removing the producer ID. This was discussed
> > > > somewhere above, but basically the idea is to store the AppID in the
> > > > message set header directly and avoid the mapping to producer ID
> > > > altogether. As long as batching isn't too bad, the impact on total
> > > > size may not be too bad, but we were ultimately more comfortable with
> > > > a fixed size ID.
> > >
> > > ...which suggests that the distinction is useful for performance, but
> > > not necessary for correctness, which makes good sense to me. (Would a
> > > 128-bit ID be a reasonable compromise? That's enough room for a UUID,
> > > or a reasonable hash of an arbitrary string, and adds only marginally
> > > to the message size.)
> > >
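> > > (For concreteness, deriving a fixed-width 128-bit ID from an arbitrary
> > > string is a one-liner with stock Java; a hedged sketch, where the app
> > > ID string is a made-up example:)
> > >
> > >     import java.nio.charset.StandardCharsets;
> > >     import java.util.UUID;
> > >
> > >     // Name-based (version 3, MD5) UUID: always 128 bits, regardless
> > >     // of how long the user-supplied identifier string is.
> > >     String appId = "my-transactional-app";  // hypothetical app ID
> > >     UUID fixedWidthId = UUID.nameUUIDFromBytes(
> > >         appId.getBytes(StandardCharsets.UTF_8));
> > >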
> > > On Tue, Dec 6, 2016 at 11:44 PM, Apurva Mehta <apu...@confluent.io> wrote:
>
> > > > Hi Ben,
> > > >
> > > > Now, on to your first question of how to deal with consumer
> > > > rebalances. The short answer is that the application needs to ensure
> > > > that the assignment of input partitions to appId is consistent across
> > > > rebalances.
> > > >
> > > > Kafka Streams already ensures that the mapping of input partitions to
> > > > task id is invariant across rebalances by implementing a custom
> > > > sticky assignor. Other non-Streams apps can trivially have one
> > > > producer per input partition and have the appId be the same as the
> > > > partition number to achieve the same effect, as in the sketch below.
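> > > >
> > > > (A hedged sketch of that pattern, assuming an `app.id` producer
> > > > config named roughly as in this proposal; the config key and the
> > > > partition count are made-up examples:)
> > > >
> > > >     import java.util.HashMap;
> > > >     import java.util.Map;
> > > >     import java.util.Properties;
> > > >     import org.apache.kafka.clients.producer.KafkaProducer;
> > > >
> > > >     // One producer per input partition, with the appId derived from
> > > >     // the partition number: the partition-to-appId mapping is then
> > > >     // trivially stable across consumer rebalances.
> > > >     int numInputPartitions = 4;  // example value
> > > >     Map<Integer, KafkaProducer<byte[], byte[]>> producers = new HashMap<>();
> > > >     for (int partition = 0; partition < numInputPartitions; partition++) {
> > > >         Properties props = new Properties();
> > > >         props.put("bootstrap.servers", "localhost:9092");
> > > >         props.put("key.serializer",
> > > >             "org.apache.kafka.common.serialization.ByteArraySerializer");
> > > >         props.put("value.serializer",
> > > >             "org.apache.kafka.common.serialization.ByteArraySerializer");
> > > >         props.put("app.id", "my-app-" + partition);  // hypothetical config key
> > > >         producers.put(partition, new KafkaProducer<>(props));
> > > >     }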
> > > >
> > > > With this precondition in place, we can maintain transactions across
> > > > rebalances.
> > > >
> > > > Hope this answers your question.
> > > >
> > > > Thanks,
> > > > Apurva
> > > >
> > > > On Tue, Dec 6, 2016 at 3:22 PM, Ben Kirwin <b...@kirw.in> wrote:
>
> > > > > Thanks for this! I'm looking forward to going through the full
> > > > > proposal in detail soon; a few early questions:
> > > > >
> > > > > First: what happens when a consumer rebalances in the middle of a
> > > > > transaction? The full documentation suggests that such a
> > > > > transaction ought to be rejected:
> > > > >
> > > > > > [...] if a rebalance has happened and this consumer instance
> > > > > > becomes a zombie, even if this offset message is appended in the
> > > > > > offset topic, the transaction will be rejected later on when it
> > > > > > tries to commit the transaction via the EndTxnRequest.
> > > > >
> > > > > ...but it's unclear to me how we ensure that a transaction can't
> > > > > complete if a rebalance has happened. (It's quite possible I'm
> > > > > missing something obvious!)
> > > > >
> > > > > As a concrete example: suppose a process with PID 1 adds offsets
> > > > > for some partition to a transaction; a consumer rebalance happens
> > > > > that assigns the partition to a process with PID 2, which adds some
> > > > > offsets to its current transaction; both processes try to commit.
> > > > > Allowing both commits would cause the messages to be processed
> > > > > twice -- how is that avoided?
> > > > >
> > > > > Second: App IDs normally map to a single PID. It seems like one
> > > > > could do away with the PID concept entirely, and just use App IDs
> > > > > in most places that require a PID. This feels like it would be
> > > > > significantly simpler, though it does increase the message size.
> > > > > Are there other reasons why the App ID / PID split is necessary?
> > > > >
> > > > > On Wed, Nov 30, 2016 at 2:19 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>
> > > > > > Hi all,
> > > > > >
> > > > > > I have just created KIP-98 to enhance Kafka with exactly-once
> > > > > > delivery semantics:
> > > > > >
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging
> > > > > >
> > > > > > This KIP adds a transactional messaging mechanism along with an
> > > > > > idempotent producer implementation to make sure that 1) duplicate
> > > > > > messages sent from the same identified producer can be detected
> > > > > > on the broker side, and 2) a group of messages sent within a
> > > > > > transaction is reflected and fetchable to consumers atomically:
> > > > > > either as a whole or not at all.
> > > > > >
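> > > > > > (A hedged sketch of the consume-transform-produce loop this
> > > > > > enables, assuming the producer API roughly as outlined in the
> > > > > > KIP; method names may still change, and client construction,
> > > > > > imports, and the `transform()` / `currentOffsets()` helpers are
> > > > > > elided or made up for illustration:)
> > > > > >
> > > > > >     producer.init();   // fences zombie producers with the same AppID
> > > > > >     while (true) {
> > > > > >         ConsumerRecords<String, String> records = consumer.poll(100);
> > > > > >         producer.beginTransaction();
> > > > > >         for (ConsumerRecord<String, String> record : records) {
> > > > > >             producer.send(transform(record));  // transform() is app-defined
> > > > > >         }
> > > > > >         // Commit the input offsets in the same transaction as the
> > > > > >         // output, so consumption and production are atomic together.
> > > > > >         producer.sendOffsetsToTransaction(currentOffsets(records), groupId);
> > > > > >         producer.commitTransaction();
> > > > > >     }
> > > > > >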
> > > > > > The above wiki page provides a high-level view of the proposed
> > > > > > changes as well as summarized guarantees. An initial draft of the
> > > > > > detailed implementation design is described in this Google doc:
> > > > > >
> > > > > > https://docs.google.com/document/d/11Jqy_GjUGtdXJK94XGsEIK7CP1SnQGdp2eF0wSw9ra8
> > > > > >
> > > > > > We would love to hear your comments and suggestions.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > -- Guozhang
