Hi, Justine,

Thanks for the reply.

101.3 I realized that I actually have two questions.
(1) In the non-overflow case, we need to write the previous producer Id
tagged field in the end marker so that we know whether the marker is from
the new client. Since the end marker is derived from the prepare marker,
should we write the previous producer Id in the prepare marker too?
Otherwise, we will lose this information when deriving the end marker.
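
To make sure we are talking about the same thing, here is a rough sketch of
what I mean by deriving the end marker from the prepare marker. The class
and field names are only illustrative, not the actual TransactionLogValue
schema or coordinator code:

    // Illustrative sketch only -- not the real TransactionLogValue schema.
    class TxnLogValueSketch {
        static final byte COMPLETE_COMMIT = 4;

        long producerId;       // non-tagged
        short producerEpoch;   // non-tagged
        Long prevProducerId;   // tagged; null when the field is not written
        byte state;

        // Deriving the end (complete) marker from the prepare marker. If the
        // prepare marker never carried prevProducerId, the fact that the
        // marker came from a new client cannot be recovered at this point.
        TxnLogValueSketch toCompleteMarker() {
            TxnLogValueSketch complete = new TxnLogValueSketch();
            complete.producerId = this.producerId;
            complete.producerEpoch = this.producerEpoch;
            complete.prevProducerId = this.prevProducerId;
            complete.state = COMPLETE_COMMIT;
            return complete;
        }
    }
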
(2) In the prepare phase, if we retry and see epoch - 1 + ID in the last
seen fields and are issuing the same command (i.e., commit, not abort), we
return success. The logic before KIP-890 seems to return
CONCURRENT_TRANSACTIONS in this case. Are we intentionally making this
change?
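
In code form, my reading of that KIP text is roughly the following (the
names are mine, not the actual coordinator logic), and it is the
Errors.NONE branch that seems to differ from the pre-KIP-890 behavior:

    import org.apache.kafka.common.protocol.Errors;

    // Rough sketch of my reading of "epoch - 1 + ID in last seen fields";
    // not the actual TransactionCoordinator code.
    class EndTxnRetryCheckSketch {
        Errors checkRetriedEndTxn(long requestProducerId, short requestEpoch,
                                  long lastSeenProducerId, short lastSeenEpoch,
                                  boolean sameCommand /* commit vs. abort matches */) {
            // A retry from a new client carries the producer id and the
            // pre-bump epoch recorded in the last seen fields.
            boolean matchesLastSeen = requestProducerId == lastSeenProducerId
                && requestEpoch == lastSeenEpoch;
            if (matchesLastSeen && sameCommand) {
                // KIP-890 text: return success (with the new epoch). The
                // pre-KIP-890 logic appears to return CONCURRENT_TRANSACTIONS
                // here instead, which is the change I am asking about.
                return Errors.NONE;
            }
            return Errors.CONCURRENT_TRANSACTIONS;
        }
    }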

112. We already merged the code that adds the VerifyOnly field to
AddPartitionsToTxnRequest, which is an inter-broker request. It seems that
we didn't bump up the IBP for that. Do you know why?
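
For context, below is the kind of gate I would have expected around the new
field on an inter-broker request. The names are purely illustrative, not
the actual broker code:

    // Hypothetical sketch: only set the new VerifyOnly field once every
    // broker is guaranteed to understand the new AddPartitionsToTxnRequest
    // version, i.e. at or above a bumped IBP/MV.
    class VerifyOnlyGateSketch {
        static boolean shouldSendVerifyOnly(int ibpVersion, int ibpThatAddsVerifyOnly) {
            return ibpVersion >= ibpThatAddsVerifyOnly;
        }
    }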

Jun

On Fri, Jan 19, 2024 at 4:50 PM Justine Olshan <jols...@confluent.io.invalid>
wrote:

> Hi Jun,
>
> 101.3 I can change "last seen" to "current producer id and epoch" if that
> was the part that was confusing
> 110 I can mention this
> 111 I can do that
> 112 We still need it. But I am still finalizing the design. I will update
> the KIP once I get the information finalized. Sorry for the delays.
>
> Justine
>
> On Fri, Jan 19, 2024 at 10:50 AM Jun Rao <j...@confluent.io.invalid> wrote:
>
> > Hi, Justine,
> >
> > Thanks for the reply.
> >
> > 101.3 In the non-overflow case, the previous ID is the same as the
> > producer ID for the complete marker too, but we set the previous ID in
> > the complete marker. Earlier you mentioned that this is to know that the
> > marker is written by the new client so that we could return success on
> > retried endMarker requests. I was trying to understand why this is not
> > needed for the prepare marker since a retry can happen in the prepare
> > state too. Is the reason that in the prepare state, we return
> > CONCURRENT_TRANSACTIONS instead of success on retried endMarker
> > requests? If so, should we change "If we retry and see epoch - 1 + ID in
> > last seen fields and are issuing the same command (ie commit not abort)
> > we can return (with the new epoch)" accordingly?
> >
> > 110. Yes, without this KIP, a delayed endMarker request carries the same
> > epoch and won't be fenced. This can commit/abort a future transaction
> > unexpectedly. I am not sure if we have seen this in practice though.
> >
> > 111. Sounds good. It would be useful to make it clear that we can now
> > populate the lastSeen field from the log reliably.
> >
> > 112. Yes, I was referring to AddPartitionsToTxnRequest since it's called
> > across brokers and we are changing its schema. Are you saying we don't
> need
> > it any more? I thought that we already implemented the server side
> > verification logic based on AddPartitionsToTxnRequest across brokers.
> >
> > Jun
> >
> >
> > On Thu, Jan 18, 2024 at 5:05 PM Justine Olshan
> > <jols...@confluent.io.invalid>
> > wrote:
> >
> > > Hey Jun,
> > >
> > > 101.3 We don't set the previous ID in the Prepare field since we don't
> > need
> > > it. It is the same producer ID as the main producer ID field.
> > >
> > > 110 Hmm -- maybe I need to reread your message about delayed markers.
> If
> > we
> > > receive a delayed endTxn marker after the transaction is already
> > complete?
> > > So we will commit the next transaction early without the fixes in part
> 2?
> > >
> > > 111 Yes -- this terminology was used in a previous KIP but was never
> > > implemented in the log -- only in memory
> > >
> > > 112 Hmm -- which inter-broker protocol are you referring to? I am
> > > working on the design for the work to remove the extra add partitions
> > > call, and right now the design bumps MV. I have yet to update that
> > > section as I finalize the design, so please stay tuned. Was there
> > > anything else you thought needed an MV bump?
> > >
> > > Justine
> > >
> > > On Thu, Jan 18, 2024 at 3:07 PM Jun Rao <j...@confluent.io.invalid>
> > wrote:
> > >
> > > > Hi, Justine,
> > > >
> > > > I don't see this creating any issue. It just makes it a bit hard to
> > > > explain what this non-tagged producer id field means. We are
> > > > essentially trying to combine two actions (completing a txn and
> > > > initializing a new producer Id) in a single record. But this may be
> > > > fine too.
> > > >
> > > > A few other follow up comments.
> > > >
> > > > 101.3 I guess the reason that we only set the previous producer id
> > > > tagged field in the complete marker, but not in the prepare marker,
> > > > is that in the prepare state, we always return CONCURRENT_TRANSACTIONS
> > > > on retried endMarker requests?
> > > >
> > > > 110. "I believe your second point is mentioned in the KIP. I can add
> > more
> > > > text on
> > > > this if it is helpful.
> > > > > The delayed message case can also violate EOS if the delayed
> message
> > > > comes in after the next addPartitionsToTxn request comes in.
> > Effectively
> > > we
> > > > may see a message from a previous (aborted) transaction become part
> of
> > > the
> > > > next transaction."
> > > >
> > > > The above is the case when a delayed message is appended to the data
> > > > partition. What I mentioned is a slightly different case when a
> delayed
> > > > marker is appended to the transaction log partition.
> > > >
> > > > 111. The KIP says "Once we move past the Prepare and Complete
> > > > states, we don’t need to worry about lastSeen fields and clear them,
> > > > just handle state transitions as normal." Is the lastSeen field the
> > > > same as the previous Producer Id tagged field in TransactionLogValue?
> > > >
> > > > 112. Since the KIP changes the inter-broker protocol, should we bump
> > > > up the MV/IBP version? Is this feature only for the KRaft mode?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > >
> > > > On Wed, Jan 17, 2024 at 11:13 AM Justine Olshan
> > > > <jols...@confluent.io.invalid> wrote:
> > > >
> > > > > Hey Jun,
> > > > >
> > > > > I'm glad we are getting to convergence on the design. :)
> > > > >
> > > > > While I understand it seems a little "weird", I'm not sure what
> > > > > the benefit of writing an extra record to the log is.
> > > > > Is the concern that a tool to describe transactions won't work
> > > > > (i.e., the complete state is needed to calculate the time since the
> > > > > transaction completed)?
> > > > > If we have a reason like this, it is enough to convince me we need
> > such
> > > > an
> > > > > extra record. It seems like it would be replacing the record
> written
> > on
> > > > > InitProducerId. Is this correct?
> > > > >
> > > > > Thanks,
> > > > > Justine
> > > > >
> > > > > On Tue, Jan 16, 2024 at 5:14 PM Jun Rao <j...@confluent.io.invalid>
> > > > wrote:
> > > > >
> > > > > > Hi, Justine,
> > > > > >
> > > > > > Thanks for the explanation. I understand the intention now. In
> the
> > > > > overflow
> > > > > > case, we set the non-tagged field to the old pid (and the max
> > epoch)
> > > in
> > > > > the
> > > > > > prepare marker so that we could correctly write the marker to the
> > > data
> > > > > > partition if the broker downgrades. When writing the complete
> > marker,
> > > > we
> > > > > > know the marker has already been written to the data partition.
> We
> > > set
> > > > > the
> > > > > > non-tagged field to the new pid to avoid
> InvalidPidMappingException
> > > in
> > > > > the
> > > > > > client if the broker downgrades.
> > > > > >
> > > > > > The above seems to work. It's just a bit inconsistent for a
> prepare
> > > > > marker
> > > > > > and a complete marker to use different pids in this special case.
> > If
> > > we
> > > > > > downgrade with the complete marker, it seems that we will never
> be
> > > able
> > > > > to
> > > > > > write the complete marker with the old pid. Not sure if it causes
> > any
> > > > > > issue, but it seems a bit weird. Instead of writing the complete
> > > marker
> > > > > > with the new pid, could we write two records: a complete marker
> > with
> > > > the
> > > > > > old pid followed by a TransactionLogValue with the new pid and an
> > > empty
> > > > > > state? We could make the two records in the same batch so that
> they
> > > > will
> > > > > be
> > > > > > added to the log atomically.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > >
> > > > > > On Fri, Jan 12, 2024 at 5:40 PM Justine Olshan
> > > > > > <jols...@confluent.io.invalid>
> > > > > > wrote:
> > > > > >
> > > > > > > (1) the prepare marker is written, but the endTxn response is
> not
> > > > > > received
> > > > > > > by the client when the server downgrades
> > > > > > > (2)  the prepare marker is written, the endTxn response is
> > received
> > > > by
> > > > > > the
> > > > > > > client when the server downgrades.
> > > > > > >
> > > > > > > I think I am still a little confused. In both of these cases,
> the
> > > > > > > transaction log has the old producer ID. We don't write the new
> > > > > producer
> > > > > > ID
> > > > > > > in the prepare marker's non tagged fields.
> > > > > > > If the server downgrades now, it would read the records not in
> > > tagged
> > > > > > > fields and the complete marker will also have the old producer
> > ID.
> > > > > > > (If we had used the new producer ID, we would not have
> > > transactional
> > > > > > > correctness since the producer id doesn't match the transaction
> > and
> > > > the
> > > > > > > state would not be correct on the data partition.)
> > > > > > >
> > > > > > > In the overflow case, I'd expect the following to happen on the
> > > > client
> > > > > > side
> > > > > > > Case 1  -- we retry EndTxn -- it is the same producer ID and
> > epoch
> > > -
> > > > 1
> > > > > > this
> > > > > > > would fence the producer
> > > > > > > Case 2 -- we don't retry EndTxn and use the new producer id
> which
> > > > would
> > > > > > > result in InvalidPidMappingException
> > > > > > >
> > > > > > > Maybe we can have special handling for when a server
> downgrades.
> > > When
> > > > > it
> > > > > > > reconnects we could get an API version request showing KIP-890
> > > part 2
> > > > > is
> > > > > > > not supported. In that case, we can call initProducerId to
> abort
> > > the
> > > > > > > transaction. (In the overflow case, this correctly gives us a
> new
> > > > > > producer
> > > > > > > ID)
> > > > > > >
> > > > > > > I guess the corresponding case would be where the *complete
> > marker
> > > > *is
> > > > > > > written but the endTxn is not received by the client and the
> > server
> > > > > > > downgrades? This would result in the transaction coordinator
> > having
> > > > the
> > > > > > new
> > > > > > > ID and not the old one.  If the client retries, it will receive
> > an
> > > > > > > InvalidPidMappingException. The InitProducerId scenario above
> > would
> > > > > help
> > > > > > > here too.
> > > > > > >
> > > > > > > To be clear, my compatibility story is meant to support
> > downgrades
> > > > > server
> > > > > > > side in keeping the transactional correctness. Keeping the
> client
> > > > from
> > > > > > > fencing itself is not the priority.
> > > > > > >
> > > > > > > Hope this helps. I can also add text in the KIP about
> > > InitProducerId
> > > > if
> > > > > > we
> > > > > > > think that fixes some edge cases.
> > > > > > >
> > > > > > > Justine
> > > > > > >
> > > > > > > On Fri, Jan 12, 2024 at 4:10 PM Jun Rao
> <j...@confluent.io.invalid
> > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hi, Justine,
> > > > > > > >
> > > > > > > > Thanks for the reply.
> > > > > > > >
> > > > > > > > I agree that we don't need to optimize for fencing during
> > > > downgrades.
> > > > > > > > Regarding consistency, there are two possible cases: (1) the
> > > > prepare
> > > > > > > marker
> > > > > > > > is written, but the endTxn response is not received by the
> > client
> > > > > when
> > > > > > > the
> > > > > > > > server downgrades; (2)  the prepare marker is written, the
> > endTxn
> > > > > > > response
> > > > > > > > is received by the client when the server downgrades. In (1),
> > the
> > > > > > client
> > > > > > > > will have the old produce Id and in (2), the client will have
> > the
> > > > new
> > > > > > > > produce Id. If we downgrade right after the prepare marker,
> we
> > > > can't
> > > > > be
> > > > > > > > consistent to both (1) and (2) since we can only put one
> value
> > in
> > > > the
> > > > > > > > existing produce Id field. It's also not clear which case is
> > more
> > > > > > likely.
> > > > > > > > So we could probably be consistent with either case. By
> putting
> > > the
> > > > > new
> > > > > > > > producer Id in the prepare marker, we are consistent with
> case
> > > (2)
> > > > > and
> > > > > > it
> > > > > > > > also has the slight benefit that the produce field in the
> > prepare
> > > > and
> > > > > > > > complete marker are consistent in the overflow case.
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > > On Fri, Jan 12, 2024 at 3:11 PM Justine Olshan
> > > > > > > > <jols...@confluent.io.invalid>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Jun,
> > > > > > > > >
> > > > > > > > > In the case you describe, we would need to have a delayed
> > > > request,
> > > > > > > send a
> > > > > > > > > successful EndTxn, and a successful AddPartitionsToTxn and
> > then
> > > > > have
> > > > > > > the
> > > > > > > > > delayed EndTxn request go through for a given producer.
> > > > > > > > > I'm trying to figure out if it is possible for the client
> to
> > > > > > transition
> > > > > > > > if
> > > > > > > > > a previous request is delayed somewhere. But yes, in this
> > case
> > > I
> > > > > > think
> > > > > > > we
> > > > > > > > > would fence the client.
> > > > > > > > >
> > > > > > > > > Not for the overflow case. In the overflow case, the
> producer
> > > ID
> > > > > and
> > > > > > > the
> > > > > > > > > epoch are different on the marker and on the new
> transaction.
> > > So
> > > > we
> > > > > > > want
> > > > > > > > > the marker to use the max epoch  but the new transaction
> > should
> > > > > start
> > > > > > > > with
> > > > > > > > > the new ID and epoch 0 in the transactional state.
> > > > > > > > >
> > > > > > > > > In the server downgrade case, we want to see the producer
> ID
> > as
> > > > > that
> > > > > > is
> > > > > > > > > what the client will have. If we complete the commit, and
> the
> > > > > > > transaction
> > > > > > > > > state is reloaded, we need the new producer ID in the state
> > so
> > > > > there
> > > > > > > > isn't
> > > > > > > > > an invalid producer ID mapping.
> > > > > > > > > The server downgrade cases are considering transactional
> > > > > correctness
> > > > > > > and
> > > > > > > > > not regressing from previous behavior -- and are not
> > concerned
> > > > > about
> > > > > > > > > supporting the safety from fencing retries (as we have
> > > downgraded
> > > > > so
> > > > > > we
> > > > > > > > > don't need to support). Perhaps this is a trade off, but I
> > > think
> > > > it
> > > > > > is
> > > > > > > > the
> > > > > > > > > right one.
> > > > > > > > >
> > > > > > > > > (If the client downgrades, it will have restarted and it is
> > ok
> > > > for
> > > > > it
> > > > > > > to
> > > > > > > > > have a new producer ID too).
> > > > > > > > >
> > > > > > > > > Justine
> > > > > > > > >
> > > > > > > > > On Fri, Jan 12, 2024 at 11:42 AM Jun Rao
> > > > <j...@confluent.io.invalid
> > > > > >
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi, Justine,
> > > > > > > > > >
> > > > > > > > > > Thanks for the reply.
> > > > > > > > > >
> > > > > > > > > > 101.4 "If the marker is written by the new client, we can
> > as
> > > I
> > > > > > > > mentioned
> > > > > > > > > in
> > > > > > > > > > the last email guarantee that any EndTxn requests with
> the
> > > same
> > > > > > epoch
> > > > > > > > are
> > > > > > > > > > from the same producer and the same transaction. Then we
> > > don't
> > > > > have
> > > > > > > to
> > > > > > > > > > return a fenced error but can handle gracefully as
> > described
> > > in
> > > > > the
> > > > > > > > KIP."
> > > > > > > > > > When a delayed EndTnx request is processed, the txn state
> > > could
> > > > > be
> > > > > > > > > ongoing
> > > > > > > > > > for the next txn. I guess in this case we still return
> the
> > > > fenced
> > > > > > > error
> > > > > > > > > for
> > > > > > > > > > the delayed request?
> > > > > > > > > >
> > > > > > > > > > 102. Sorry, my question was inaccurate. What you
> described
> > is
> > > > > > > accurate.
> > > > > > > > > > "The downgrade compatibility I mention is that we keep
> the
> > > same
> > > > > > > > producer
> > > > > > > > > ID
> > > > > > > > > > and epoch in the main (non-tagged) fields as we did
> before
> > > the
> > > > > code
> > > > > > > on
> > > > > > > > > the
> > > > > > > > > > server side." If we want to do this, it seems that we
> > should
> > > > use
> > > > > > the
> > > > > > > > > > current produce Id and max epoch in the existing
> producerId
> > > and
> > > > > > > > > > producerEpoch fields for both the prepare and the
> complete
> > > > > marker,
> > > > > > > > right?
> > > > > > > > > > The downgrade can happen after the complete marker is
> > > written.
> > > > > With
> > > > > > > > what
> > > > > > > > > > you described, the downgraded coordinator will see the
> new
> > > > > produce
> > > > > > Id
> > > > > > > > > > instead of the old one.
> > > > > > > > > >
> > > > > > > > > > Jun
> > > > > > > > > >
> > > > > > > > > > On Fri, Jan 12, 2024 at 10:44 AM Justine Olshan
> > > > > > > > > > <jols...@confluent.io.invalid> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Jun,
> > > > > > > > > > >
> > > > > > > > > > > I can update the description.
> > > > > > > > > > >
> > > > > > > > > > > I believe your second point is mentioned in the KIP. I
> > can
> > > > add
> > > > > > more
> > > > > > > > > text
> > > > > > > > > > on
> > > > > > > > > > > this if it is helpful.
> > > > > > > > > > > > The delayed message case can also violate EOS if the
> > > > delayed
> > > > > > > > message
> > > > > > > > > > > comes in after the next addPartitionsToTxn request
> comes
> > > in.
> > > > > > > > > Effectively
> > > > > > > > > > we
> > > > > > > > > > > may see a message from a previous (aborted) transaction
> > > > become
> > > > > > part
> > > > > > > > of
> > > > > > > > > > the
> > > > > > > > > > > next transaction.
> > > > > > > > > > >
> > > > > > > > > > > If the marker is written by the new client, we can as I
> > > > > mentioned
> > > > > > > in
> > > > > > > > > the
> > > > > > > > > > > last email guarantee that any EndTxn requests with the
> > same
> > > > > epoch
> > > > > > > are
> > > > > > > > > > from
> > > > > > > > > > > the same producer and the same transaction. Then we
> don't
> > > > have
> > > > > to
> > > > > > > > > return
> > > > > > > > > > a
> > > > > > > > > > > fenced error but can handle gracefully as described in
> > the
> > > > KIP.
> > > > > > > > > > > I don't think a boolean is useful since it is directly
> > > > encoded
> > > > > by
> > > > > > > the
> > > > > > > > > > > existence or lack of the tagged field being written.
> > > > > > > > > > > In the prepare marker we will have the same producer ID
> > in
> > > > the
> > > > > > > > > non-tagged
> > > > > > > > > > > field. In the Complete state we may not.
> > > > > > > > > > > I'm not sure why the ongoing state matters for this
> KIP.
> > It
> > > > > does
> > > > > > > > matter
> > > > > > > > > > for
> > > > > > > > > > > KIP-939.
> > > > > > > > > > >
> > > > > > > > > > > I'm not sure what you are referring to about writing
> the
> > > > > previous
> > > > > > > > > > producer
> > > > > > > > > > > ID in the prepare marker. This is not in the KIP.
> > > > > > > > > > > In the overflow case, we write the nextProducerId in
> the
> > > > > prepare
> > > > > > > > state.
> > > > > > > > > > > This is so we know what we assigned when we reload the
> > > > > > transaction
> > > > > > > > log.
> > > > > > > > > > > Once we complete, we transition this ID to the main
> > > > (non-tagged
> > > > > > > > field)
> > > > > > > > > > and
> > > > > > > > > > > have the previous producer ID field filled in. This is
> so
> > > we
> > > > > can
> > > > > > > > > identify
> > > > > > > > > > > in a retry case the operation completed successfully
> and
> > we
> > > > > don't
> > > > > > > > fence
> > > > > > > > > > our
> > > > > > > > > > > producer. The downgrade compatibility I mention is that
> > we
> > > > keep
> > > > > > the
> > > > > > > > > same
> > > > > > > > > > > producer ID and epoch in the main (non-tagged) fields
> as
> > we
> > > > did
> > > > > > > > before
> > > > > > > > > > the
> > > > > > > > > > > code on the server side. If the server downgrades, we
> are
> > > > still
> > > > > > > > > > compatible.
> > > > > > > > > > > This addresses both the prepare and complete state
> > > > downgrades.
> > > > > > > > > > >
> > > > > > > > > > > Justine
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Jan 12, 2024 at 10:21 AM Jun Rao
> > > > > > <j...@confluent.io.invalid
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi, Justine,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for the reply. Sorry for the delay. I have a
> few
> > > > more
> > > > > > > > > comments.
> > > > > > > > > > > >
> > > > > > > > > > > > 110. I think the motivation section could be
> improved.
> > > One
> > > > of
> > > > > > the
> > > > > > > > > > > > motivations listed by the KIP is "This can happen
> when
> > a
> > > > > > message
> > > > > > > > gets
> > > > > > > > > > > stuck
> > > > > > > > > > > > or delayed due to networking issues or a network
> > > partition,
> > > > > the
> > > > > > > > > > > transaction
> > > > > > > > > > > > aborts, and then the delayed message finally comes
> > in.".
> > > > This
> > > > > > > seems
> > > > > > > > > not
> > > > > > > > > > > > very accurate. Without KIP-890, currently, if the
> > > > coordinator
> > > > > > > times
> > > > > > > > > out
> > > > > > > > > > > and
> > > > > > > > > > > > aborts an ongoing transaction, it already bumps up
> the
> > > > epoch
> > > > > in
> > > > > > > the
> > > > > > > > > > > marker,
> > > > > > > > > > > > which prevents the delayed produce message from being
> > > added
> > > > > to
> > > > > > > the
> > > > > > > > > user
> > > > > > > > > > > > partition. What can cause a hanging transaction is
> that
> > > the
> > > > > > > > producer
> > > > > > > > > > > > completes (either aborts or commits) a transaction
> > before
> > > > > > > > receiving a
> > > > > > > > > > > > successful ack on messages published in the same txn.
> > In
> > > > this
> > > > > > > case,
> > > > > > > > > > it's
> > > > > > > > > > > > possible for the delayed message to be appended to
> the
> > > > > > partition
> > > > > > > > > after
> > > > > > > > > > > the
> > > > > > > > > > > > marker, causing a transaction to hang.
> > > > > > > > > > > >
> > > > > > > > > > > > A similar issue (not mentioned in the motivation)
> could
> > > > > happen
> > > > > > on
> > > > > > > > the
> > > > > > > > > > > > marker in the coordinator's log. For example, it's
> > > possible
> > > > > for
> > > > > > > an
> > > > > > > > > > > > EndTxnRequest to be delayed on the coordinator. By
> the
> > > time
> > > > > the
> > > > > > > > > delayed
> > > > > > > > > > > > EndTxnRequest is processed, it's possible that the
> > > previous
> > > > > txn
> > > > > > > has
> > > > > > > > > > > already
> > > > > > > > > > > > completed and a new txn has started. Currently, since
> > the
> > > > > epoch
> > > > > > > is
> > > > > > > > > not
> > > > > > > > > > > > bumped on every txn, the delayed EndTxnRequest will
> add
> > > an
> > > > > > > > unexpected
> > > > > > > > > > > > prepare marker (and eventually a complete marker) to
> > the
> > > > > > ongoing
> > > > > > > > txn.
> > > > > > > > > > > This
> > > > > > > > > > > > won't cause the transaction to hang, but it will
> break
> > > the
> > > > > EoS
> > > > > > > > > > semantic.
> > > > > > > > > > > > The proposal in this KIP will address this issue too.
> > > > > > > > > > > >
> > > > > > > > > > > > 101. "However, I was writing it so that we can
> > > distinguish
> > > > > > > between
> > > > > > > > > > > > old clients where we don't have the ability do this
> > > > operation
> > > > > > and
> > > > > > > > new
> > > > > > > > > > > > clients that can. (Old clients don't bump the epoch
> on
> > > > > commit,
> > > > > > so
> > > > > > > > we
> > > > > > > > > > > can't
> > > > > > > > > > > > say for sure the write belongs to the given
> > > transaction)."
> > > > > > > > > > > > 101.1 I am wondering why we need to distinguish
> whether
> > > the
> > > > > > > marker
> > > > > > > > is
> > > > > > > > > > > > written by the old and the new client. Could you
> > describe
> > > > > what
> > > > > > we
> > > > > > > > do
> > > > > > > > > > > > differently if we know the marker is written by the
> new
> > > > > client?
> > > > > > > > > > > > 101.2 If we do need a way to distinguish whether the
> > > marker
> > > > > is
> > > > > > > > > written
> > > > > > > > > > by
> > > > > > > > > > > > the old and the new client. Would it be simpler to
> just
> > > > > > > introduce a
> > > > > > > > > > > boolean
> > > > > > > > > > > > field instead of indirectly through the previous
> > produce
> > > ID
> > > > > > > field?
> > > > > > > > > > > > 101.3 It's not clear to me why we only add the
> previous
> > > > > produce
> > > > > > > ID
> > > > > > > > > > field
> > > > > > > > > > > in
> > > > > > > > > > > > the complete marker, but not in the prepare marker.
> If
> > we
> > > > > want
> > > > > > to
> > > > > > > > > know
> > > > > > > > > > > > whether a marker is written by the new client or not,
> > it
> > > > > seems
> > > > > > > that
> > > > > > > > > we
> > > > > > > > > > > want
> > > > > > > > > > > > to do this consistently for all markers.
> > > > > > > > > > > > 101.4 What about the TransactionLogValue record
> > > > representing
> > > > > > the
> > > > > > > > > > ongoing
> > > > > > > > > > > > state? Should we also distinguish whether it's
> written
> > by
> > > > the
> > > > > > old
> > > > > > > > or
> > > > > > > > > > the
> > > > > > > > > > > > new client?
> > > > > > > > > > > >
> > > > > > > > > > > > 102. In the overflow case, it's still not clear to me
> > why
> > > > we
> > > > > > > write
> > > > > > > > > the
> > > > > > > > > > > > previous produce Id in the prepare marker while
> writing
> > > the
> > > > > > next
> > > > > > > > > > produce
> > > > > > > > > > > Id
> > > > > > > > > > > > in the complete marker. You mentioned that it's for
> > > > > > downgrading.
> > > > > > > > > > However,
> > > > > > > > > > > > we could downgrade with either the prepare marker or
> > the
> > > > > > complete
> > > > > > > > > > marker.
> > > > > > > > > > > > In either case, the downgraded coordinator should see
> > the
> > > > > same
> > > > > > > > > produce
> > > > > > > > > > id
> > > > > > > > > > > > (probably the previous produce Id), right?
> > > > > > > > > > > >
> > > > > > > > > > > > Jun
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Dec 20, 2023 at 6:00 PM Justine Olshan
> > > > > > > > > > > > <jols...@confluent.io.invalid>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hey Jun,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for taking a look at the KIP again.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 100. For the epoch overflow case, only the marker
> > will
> > > > have
> > > > > > max
> > > > > > > > > > epoch.
> > > > > > > > > > > > This
> > > > > > > > > > > > > keeps the behavior of the rest of the markers where
> > the
> > > > > last
> > > > > > > > marker
> > > > > > > > > > is
> > > > > > > > > > > > the
> > > > > > > > > > > > > epoch of the transaction records + 1.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 101. You are correct that we don't need to write
> the
> > > > > producer
> > > > > > > ID
> > > > > > > > > > since
> > > > > > > > > > > it
> > > > > > > > > > > > > is the same. However, I was writing it so that we
> can
> > > > > > > distinguish
> > > > > > > > > > > between
> > > > > > > > > > > > > old clients where we don't have the ability do this
> > > > > operation
> > > > > > > and
> > > > > > > > > new
> > > > > > > > > > > > > clients that can. (Old clients don't bump the epoch
> > on
> > > > > > commit,
> > > > > > > so
> > > > > > > > > we
> > > > > > > > > > > > can't
> > > > > > > > > > > > > say for sure the write belongs to the given
> > > transaction).
> > > > > If
> > > > > > we
> > > > > > > > > > receive
> > > > > > > > > > > > an
> > > > > > > > > > > > > EndTxn request from a new client, we will fill this
> > > > field.
> > > > > We
> > > > > > > can
> > > > > > > > > > > > guarantee
> > > > > > > > > > > > > that any EndTxn requests with the same epoch are
> from
> > > the
> > > > > > same
> > > > > > > > > > producer
> > > > > > > > > > > > and
> > > > > > > > > > > > > the same transaction.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 102. In prepare phase, we have the same producer ID
> > and
> > > > > epoch
> > > > > > > we
> > > > > > > > > > always
> > > > > > > > > > > > > had. It is the producer ID and epoch that are on
> the
> > > > > marker.
> > > > > > In
> > > > > > > > > > commit
> > > > > > > > > > > > > phase, we stay the same unless it is the overflow
> > case.
> > > > In
> > > > > > that
> > > > > > > > > case,
> > > > > > > > > > > we
> > > > > > > > > > > > > set the producer ID to the new one we generated and
> > > epoch
> > > > > to
> > > > > > 0
> > > > > > > > > after
> > > > > > > > > > > > > complete. This is for downgrade compatibility. The
> > > tagged
> > > > > > > fields
> > > > > > > > > are
> > > > > > > > > > > just
> > > > > > > > > > > > > safety guards for retries and failovers.
> > > > > > > > > > > > >
> > > > > > > > > > > > > In prepare phase for epoch overflow case only we
> > store
> > > > the
> > > > > > next
> > > > > > > > > > > producer
> > > > > > > > > > > > > ID. This is for the case where we reload the
> > > transaction
> > > > > > > > > coordinator
> > > > > > > > > > in
> > > > > > > > > > > > > prepare state. Once the transaction is committed,
> we
> > > can
> > > > > use
> > > > > > > the
> > > > > > > > > > > producer
> > > > > > > > > > > > > ID the client already is using.
> > > > > > > > > > > > >
> > > > > > > > > > > > > In commit phase, we store the previous producer ID
> in
> > > > case
> > > > > of
> > > > > > > > > > retries.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think it is easier to think of it as just how we
> > were
> > > > > > storing
> > > > > > > > > > > producer
> > > > > > > > > > > > ID
> > > > > > > > > > > > > and epoch before, with some extra bookkeeping and edge
> edge
> > > > case
> > > > > > > > handling
> > > > > > > > > > in
> > > > > > > > > > > > the
> > > > > > > > > > > > > tagged fields. We have to do it this way for
> > > > compatibility
> > > > > > with
> > > > > > > > > > > > downgrades.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 103. Next producer ID is for prepare status and
> > > previous
> > > > > > > producer
> > > > > > > > > ID
> > > > > > > > > > is
> > > > > > > > > > > > for
> > > > > > > > > > > > > after complete. The reason why we need two separate
> > > > > (tagged)
> > > > > > > > fields
> > > > > > > > > > is
> > > > > > > > > > > > for
> > > > > > > > > > > > > backwards compatibility. We need to keep the same
> > > > semantics
> > > > > > for
> > > > > > > > the
> > > > > > > > > > > > > non-tagged field in case we downgrade.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 104. We set the fields as we do in the
> transactional
> > > > state
> > > > > > (as
> > > > > > > we
> > > > > > > > > > need
> > > > > > > > > > > to
> > > > > > > > > > > > > do this for compatibility -- if we downgrade, we
> will
> > > > only
> > > > > > have
> > > > > > > > the
> > > > > > > > > > > > > non-tagged fields) It will be the old producer ID
> and
> > > max
> > > > > > > epoch.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hope this helps. Let me know if you have further
> > > > questions.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Justine
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Dec 20, 2023 at 3:33 PM Jun Rao
> > > > > > > <j...@confluent.io.invalid
> > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi, Justine,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > It seems that you have made some changes to
> KIP-890
> > > > since
> > > > > > the
> > > > > > > > > vote.
> > > > > > > > > > > In
> > > > > > > > > > > > > > particular, we are changing the format of
> > > > > > > TransactionLogValue.
> > > > > > > > A
> > > > > > > > > > few
> > > > > > > > > > > > > > comments related to that.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 100. Just to be clear. The overflow case (i.e.
> > when a
> > > > new
> > > > > > > > > > producerId
> > > > > > > > > > > is
> > > > > > > > > > > > > > generated) is when the current epoch equals to
> max
> > -
> > > 1
> > > > > and
> > > > > > > not
> > > > > > > > > max?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 101. For the "not epoch overflow" case, we write
> > the
> > > > > > previous
> > > > > > > > ID
> > > > > > > > > in
> > > > > > > > > > > the
> > > > > > > > > > > > > > tagged field in the complete phase. Do we need to
> > do
> > > > that
> > > > > > > since
> > > > > > > > > > > produce
> > > > > > > > > > > > > id
> > > > > > > > > > > > > > doesn't change in this case?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 102. It seems that the meaning for the
> > > > > > > ProducerId/ProducerEpoch
> > > > > > > > > > > fields
> > > > > > > > > > > > in
> > > > > > > > > > > > > > TransactionLogValue changes depending on the
> > > > > > > TransactionStatus.
> > > > > > > > > > When
> > > > > > > > > > > > > > the TransactionStatus is ongoing, they represent
> > the
> > > > > > current
> > > > > > > > > > > ProducerId
> > > > > > > > > > > > > and
> > > > > > > > > > > > > > the current ProducerEpoch. When the
> > TransactionStatus
> > > > is
> > > > > > > > > > > > > > PrepareCommit/PrepareAbort, they represent the
> > > current
> > > > > > > > ProducerId
> > > > > > > > > > and
> > > > > > > > > > > > the
> > > > > > > > > > > > > > next ProducerEpoch. When the TransactionStatus is
> > > > > > > Commit/Abort,
> > > > > > > > > > they
> > > > > > > > > > > > > > further depend on whether the epoch overflows or
> > not.
> > > > If
> > > > > > > there
> > > > > > > > is
> > > > > > > > > > no
> > > > > > > > > > > > > > overflow, they represent  the current ProducerId
> > and
> > > > the
> > > > > > next
> > > > > > > > > > > > > ProducerEpoch
> > > > > > > > > > > > > > (max). Otherwise, they represent the newly
> > generated
> > > > > > > ProducerId
> > > > > > > > > > and a
> > > > > > > > > > > > > > ProducerEpoch of 0. Is that right? This seems not
> > > easy
> > > > to
> > > > > > > > > > understand.
> > > > > > > > > > > > > Could
> > > > > > > > > > > > > > we provide some examples like what Artem has done
> > in
> > > > > > KIP-939?
> > > > > > > > > Have
> > > > > > > > > > we
> > > > > > > > > > > > > > considered a simpler design where
> > > > > ProducerId/ProducerEpoch
> > > > > > > > always
> > > > > > > > > > > > > represent
> > > > > > > > > > > > > > the same value (e.g. for the current transaction)
> > > > > > independent
> > > > > > > > of
> > > > > > > > > > the
> > > > > > > > > > > > > > TransactionStatus and epoch overflow?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 103. It's not clear to me why we need 3 fields:
> > > > > ProducerId,
> > > > > > > > > > > > > PrevProducerId,
> > > > > > > > > > > > > > NextProducerId. Could we just have ProducerId and
> > > > > > > > NextProducerId?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 104. For WriteTxnMarkerRequests, if the producer
> > > epoch
> > > > > > > > overflows,
> > > > > > > > > > > what
> > > > > > > > > > > > do
> > > > > > > > > > > > > > we set the producerId and the producerEpoch?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
