Hey folks, just wanted to share another quick update on KIP-890.

Making the change to bump epoch after every transaction means that we no
longer need to call InitProducerId during the life of the producer to fence
requests from previous transactions. With KIP-890 part 2, the new client
will no longer call InitProducerId except on startup to fence a previous
instance.
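
To make the fencing idea concrete, here is a minimal sketch of a coordinator that bumps the epoch after every transaction. It is a toy model, not the broker code: the class EpochBumpSketch, the Result enum and the method names are all made up for illustration, and epoch overflow is ignored.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (hypothetical names): a coordinator that bumps the epoch after every transaction. */
public class EpochBumpSketch {
    /** Hypothetical result codes; these are not Kafka's real error codes. */
    public enum Result { OK, FENCED }

    // Current epoch per transactional id, as the coordinator sees it.
    private final Map<String, Integer> currentEpoch = new HashMap<>();

    /** Models the single InitProducerId call at startup: starts at 0, or bumps to fence an older instance. */
    public int initProducer(String txnId) {
        return currentEpoch.merge(txnId, 0, (old, unused) -> old + 1);
    }

    /** Accepts a request only if it carries the current epoch. */
    public Result handleRequest(String txnId, int epoch) {
        Integer current = currentEpoch.get(txnId);
        return (current != null && current == epoch) ? Result.OK : Result.FENCED;
    }

    /** Ends a transaction and returns the bumped epoch the producer must use next. */
    public int endTxn(String txnId, int epoch) {
        if (handleRequest(txnId, epoch) != Result.OK) {
            throw new IllegalStateException("stale epoch " + epoch);
        }
        int next = epoch + 1; // overflow handling is left out of this sketch
        currentEpoch.put(txnId, next);
        return next;
    }

    public static void main(String[] args) {
        EpochBumpSketch coordinator = new EpochBumpSketch();
        int epoch = coordinator.initProducer("copy-app");
        int next = coordinator.endTxn("copy-app", epoch);
        // A delayed request from the finished transaction still carries the old epoch.
        System.out.println("late request: " + coordinator.handleRequest("copy-app", epoch));
        System.out.println("new transaction: " + coordinator.handleRequest("copy-app", next));
    }
}
```

In this model the late request is rejected without any extra InitProducerId call, because the bump at the end of the transaction already invalidated the old epoch.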

With this change, some other calls to InitProducerId were inspected,
including the call after receiving an InvalidPidMappingException. This
exception was changed to abortable as part of KIP-360: Improve reliability
of idempotent/transactional producer
<https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=89068820>.
However, treating the exception as abortable means that we can violate EOS
guarantees. As an example:

Consider an application that is copying data from one partition to another:

   - Application instance A processes to offset 4
   - Application instance B comes up and fences application instance A
   - Application instance B processes to offset 5
   - Application instances A and B are idle for transaction.id.expiration.ms,
   transaction id expires on server
   - Application instance A attempts to process offset 5 (since in its
   view, that is next) -- if we recover from invalid pid mapping, we can
   duplicate this processing

Thus, INVALID_PID_MAPPING should be fatal to the producer and as part of
KIP-890 the new client in 4.0 will treat the exception as fatal.
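
The scenario above can be replayed with a small simulation. This is only a sketch under stated assumptions: PidMappingSketch, its Error enum and its methods are hypothetical and do not correspond to real client or broker classes, and "copying" an offset is modeled as appending it to a list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy replay of the scenario (hypothetical names, not Kafka classes). */
public class PidMappingSketch {
    /** Hypothetical error codes for this model only. */
    public enum Error { NONE, PRODUCER_FENCED, INVALID_PID_MAPPING }

    // Coordinator state: transactional id -> current epoch. Absent means expired or never seen.
    private final Map<String, Integer> epochs = new HashMap<>();
    // Offsets copied to the output partition, in commit order.
    public final List<Integer> output = new ArrayList<>();

    /** Models InitProducerId: starts at epoch 0, or bumps the epoch to fence an older instance. */
    public int init(String txnId) {
        return epochs.merge(txnId, 0, (old, unused) -> old + 1);
    }

    /** Models the transactional id expiring on the server. */
    public void expire(String txnId) {
        epochs.remove(txnId);
    }

    /** Copies one offset if the caller's epoch is current. */
    public Error copy(String txnId, int epoch, int offset) {
        Integer current = epochs.get(txnId);
        if (current == null) return Error.INVALID_PID_MAPPING;
        if (current != epoch) return Error.PRODUCER_FENCED;
        output.add(offset);
        return Error.NONE;
    }

    /** Replays the scenario; the flag selects recovering (abortable) versus stopping (fatal). */
    public static List<Integer> run(boolean recoverFromInvalidPidMapping) {
        PidMappingSketch c = new PidMappingSketch();
        String id = "copy-app";
        int epochA = c.init(id);
        for (int o = 0; o <= 4; o++) c.copy(id, epochA, o); // A processes to offset 4
        int epochB = c.init(id);                             // B comes up and fences A
        c.copy(id, epochB, 5);                               // B processes offset 5
        c.expire(id);                                        // both idle, id expires
        Error e = c.copy(id, epochA, 5);                     // A thinks offset 5 is next
        if (e == Error.INVALID_PID_MAPPING && recoverFromInvalidPidMapping) {
            epochA = c.init(id);                             // re-initialize and retry
            c.copy(id, epochA, 5);
        }
        return c.output;
    }

    public static void main(String[] args) {
        System.out.println("recover: " + run(true));
        System.out.println("fatal:   " + run(false));
    }
}
```

When instance A recovers, offset 5 ends up in the output twice; when the error is fatal, A stops and the output contains each offset once. Without the expiration step, A's request would simply be fenced in this model.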

This is consistent with KIP-1050: Consistent error handling for Transactions
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-1050%3A+Consistent+error+handling+for+Transactions>
where errors that are fatal to the producer are in the "application
recoverable" category. This is a grouping that indicates to the client that
the producer needs to restart and recovery on the application side is
necessary. KIP-1050 is approved so we are consistent with that decision.

Let me know if there are any questions. I've updated the KIP to reflect
this change.

Justine

On Wed, Jul 10, 2024 at 3:34 PM Jun Rao <j...@confluent.io.invalid> wrote:

> Sounds good, Justine. It would be useful to document that in the KIP.
>
> Thanks,
>
> Jun
>
> On Wed, Jul 10, 2024 at 2:59 PM Justine Olshan
> <jols...@confluent.io.invalid>
> wrote:
>
> > The client will send the newest EndTxn request version if and only if
> both
> > the client and the server support kip-890 part 2.
> > We set the value in the record based on the EndTxn version.
> >
> > Justine
> >
> > On Wed, Jul 10, 2024 at 2:50 PM Jun Rao <j...@confluent.io.invalid>
> wrote:
> >
> > > Hi, Justine,
> > >
> > > Thanks for the reply.
> > >
> > > 120. If the broker sends TV Y for the finalized version in
> > > ApiVersionResponse, but the client doesn't support Y, how does the
> broker
> > > know the TV that the client supports?
> > >
> > > Jun
> > >
> > > On Wed, Jul 10, 2024 at 2:29 PM Justine Olshan
> > > <jols...@confluent.io.invalid>
> > > wrote:
> > >
> > > > Hey Jun,
> > > >
> > > > No worries. Work on this KIP has been blocked for a bit anyways --
> > > catching
> > > > up and rereading what I wrote :)
> > > >
> > > > > 120. ClientTransactionProtocolVersion is the transaction version as
> > > > > defined by the highest transaction version (feature version value)
> > > > > supported by the client and the server. This works by the broker
> > > > > sending an ApiVersionsResponse to the client with the finalized
> > > > > version. Assuming kip-890 part 2 is enabled by transaction version Y,
> > > > > if this response contains finalized version Y and the client has the
> > > > > logic to set this field, it will set Y. If the server has Y - 1
> > > > > (kip-890 part 2 not enabled), the client will send Y - 1, even though
> > > > > the client has the ability to support kip-890 part 2.
> > > >
> > > > 121. You are correct that this is not needed. However, currently that
> > > field
> > > > is already being set in memory -- just not written to disk. I think
> it
> > is
> > > > ok to write it to disk though. Let me know if you think otherwise.
> > > >
> > > > Justine
> > > >
> > > > On Wed, Jul 10, 2024 at 2:16 PM Jun Rao <j...@confluent.io.invalid>
> > > wrote:
> > > >
> > > > > Hi, Justine,
> > > > >
> > > > > Thanks for the update and sorry for the late reply.
> > > > >
> > > > > 120. I am wondering what value is used for
> > > > > ClientTransactionProtocolVersion. Is it the version of the
> > > EndTxnRequest?
> > > > >
> > > > > 121. Earlier, you made the change to set lastProducerId in PREPARE
> to
> > > > > indicate that the marker is written for the new client. With the
> new
> > > > > ClientTransactionProtocolVersion field, it seems this is no longer
> > > > > necessary.
> > > > >
> > > > > Jun
> > > > >
> > > > > On Thu, Mar 28, 2024 at 2:41 PM Justine Olshan
> > > > > <jols...@confluent.io.invalid>
> > > > > wrote:
> > > > >
> > > > > > Hi there -- another update!
> > > > > >
> > > > > > When looking into the implementation for the safe epoch bumps I
> > > > realized
> > > > > > that we are already populating previousProducerID in memory as
> part
> > > of
> > > > > > KIP-360.
> > > > > > If we are to start using flexible fields, it is better to always
> > use
> > > > this
> > > > > > information and have an explicit (tagged) field to indicate
> whether
> > > the
> > > > > > client supports KIP-890 part 2.
> > > > > >
> > > > > > I've included the extra field and how it is set in the KIP. I've
> > also
> > > > > > updated the KIP to explain that we will be setting the tagged
> > fields
> > > > when
> > > > > > they are available for all transitions.
> > > > > >
> > > > > > Finally, I added clearer text about the transaction protocol
> > versions
> > > > > > included with this KIP. 1 for flexible transaction state records
> > and
> > > 2
> > > > > for
> > > > > > KIP-890 part 2 enablement.
> > > > > >
> > > > > > Justine
> > > > > >
> > > > > > On Mon, Mar 18, 2024 at 6:39 PM Justine Olshan <
> > jols...@confluent.io
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hey there -- small update to the KIP,
> > > > > > >
> > > > > > > The KIP mentioned introducing ABORTABLE_ERROR and bumping
> > > > > TxnOffsetCommit
> > > > > > > and Produce requests. I've changed the name in the KIP to
> > > > > > > ABORTABLE_TRANSACTION and the corresponding exception
> > > > > > > AbortableTransactionException to match the pattern we had for
> > other
> > > > > > errors.
> > > > > > > I also mentioned bumping all 6 transactional APIs so we can
> > future
> > > > > > > proof/support the error on the client going forward. If a
> future
> > > > change
> > > > > > > wants to have an error scenario that requires us to abort the
> > > > > > transaction,
> > > > > > > we can rely on the 3.8+ clients to support it. We ran into
> issues
> > > > > finding
> > > > > > > good/generic error codes that older clients could support while
> > > > working
> > > > > > on
> > > > > > > this KIP, so this should help in the future.
> > > > > > >
> > > > > > > The features discussion is still ongoing in KIP-1022. Will
> update
> > > > again
> > > > > > > here when that concludes.
> > > > > > >
> > > > > > > Justine
> > > > > > >
> > > > > > > On Tue, Feb 6, 2024 at 8:39 AM Justine Olshan <
> > > jols...@confluent.io>
> > > > > > > wrote:
> > > > > > >
> > > > > > > >> I don't think AddPartitions is a good example since we
> > > > > > > >> currently don't gate the version on TV or MV. (We only set a
> > > > > > > >> different flag depending on the TV.)
> > > > > > >>
> > > > > > >> Even if we did want to gate it on TV, I think the idea is to
> > move
> > > > away
> > > > > > >> from MV gating inter broker protocols. Ideally we can get to a
> > > state
> > > > > > where
> > > > > > >> MV is just used for metadata changes.
> > > > > > >>
> > > > > > >> I think some of this discussion might fit more with the
> feature
> > > > > version
> > > > > > >> KIP, so I can try to open that up soon. Until we settle that,
> > some
> > > > of
> > > > > > the
> > > > > > >> work in KIP-890 is blocked.
> > > > > > >>
> > > > > > >> Justine
> > > > > > >>
> > > > > > >> On Mon, Feb 5, 2024 at 5:38 PM Jun Rao
> <j...@confluent.io.invalid
> > >
> > > > > > wrote:
> > > > > > >>
> > > > > > >>> Hi, Justine,
> > > > > > >>>
> > > > > > >>> Thanks for the reply.
> > > > > > >>>
> > > > > > >>> Since AddPartitions is an inter broker request, will its
> > version
> > > be
> > > > > > gated
> > > > > > >>> only by TV or other features like MV too? For example, if we
> > need
> > > > to
> > > > > > >>> change
> > > > > > >>> the protocol for AddPartitions for reasons other than txn
> > > > > verification
> > > > > > in
> > > > > > >>> the future, will the new version be gated by a new MV? If so,
> > > does
> > > > > > >>> downgrading a TV imply potential downgrade of MV too?
> > > > > > >>>
> > > > > > >>> Jun
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> On Mon, Feb 5, 2024 at 5:07 PM Justine Olshan
> > > > > > >>> <jols...@confluent.io.invalid>
> > > > > > >>> wrote:
> > > > > > >>>
> > > > > > >>> > One TV gates the flexible feature version (no rpcs
> involved,
> > > only
> > > > > the
> > > > > > >>> > transactional records that should only be gated by TV)
> > > > > > >>> > Another TV gates the ability to turn on kip-890 part 2.
> This
> > > > would
> > > > > > >>> gate the
> > > > > > >>> > version of Produce and EndTxn (likely only used by
> > > transactions),
> > > > > and
> > > > > > >>> > specifies a flag in AddPartitionsToTxn though the version
> is
> > > > > already
> > > > > > >>> used
> > > > > > >>> > without TV.
> > > > > > >>> >
> > > > > > >>> > I think the only concern is the Produce request and we
> could
> > > > > consider
> > > > > > >>> work
> > > > > > >>> > arounds similar to the AddPartitionsToTxn call.
> > > > > > >>> >
> > > > > > >>> > Justine
> > > > > > >>> >
> > > > > > >>> > On Mon, Feb 5, 2024 at 4:56 PM Jun Rao
> > > <j...@confluent.io.invalid
> > > > >
> > > > > > >>> wrote:
> > > > > > >>> >
> > > > > > >>> > > Hi, Justine,
> > > > > > >>> > >
> > > > > > > >>> > > Which RPC/record protocols will TV guard? Going forward,
> > > > > > > >>> > > will those RPC/record protocols only be guarded by TV and
> > > > > > > >>> > > not by other features like MV?
> > > > > > >>> > >
> > > > > > >>> > > Thanks,
> > > > > > >>> > >
> > > > > > >>> > > Jun
> > > > > > >>> > >
> > > > > > >>> > > On Mon, Feb 5, 2024 at 2:41 PM Justine Olshan
> > > > > > >>> > <jols...@confluent.io.invalid
> > > > > > >>> > > >
> > > > > > >>> > > wrote:
> > > > > > >>> > >
> > > > > > >>> > > > Hi Jun,
> > > > > > >>> > > >
> > > > > > > >>> > > > Sorry, I think I misunderstood your question or answered
> > > > > > > >>> > > > incorrectly. The TV should ideally be fully independent
> > > > > > > >>> > > > from MV. At least for the changes I proposed, TV should
> > > > > > > >>> > > > not affect MV and MV should not affect TV.
> > > > > > >>> > > >
> > > > > > >>> > > > I think if we downgrade TV, only that feature should
> > > > downgrade.
> > > > > > >>> > Likewise
> > > > > > >>> > > > the same with MV. The finalizedFeatures should just
> > reflect
> > > > the
> > > > > > >>> feature
> > > > > > >>> > > > downgrade we made.
> > > > > > >>> > > >
> > > > > > >>> > > > I also plan to write a new KIP for managing the disk
> > format
> > > > and
> > > > > > >>> upgrade
> > > > > > >>> > > > tool as we will need new flags to support these
> features.
> > > > That
> > > > > > >>> should
> > > > > > >>> > > help
> > > > > > >>> > > > clarify some things.
> > > > > > >>> > > >
> > > > > > >>> > > > Justine
> > > > > > >>> > > >
> > > > > > >>> > > > On Mon, Feb 5, 2024 at 11:03 AM Jun Rao
> > > > > <j...@confluent.io.invalid
> > > > > > >
> > > > > > >>> > > wrote:
> > > > > > >>> > > >
> > > > > > >>> > > > > Hi, Justine,
> > > > > > >>> > > > >
> > > > > > >>> > > > > Thanks for the reply.
> > > > > > >>> > > > >
> > > > > > >>> > > > > So, if we downgrade TV, we could implicitly downgrade
> > > > another
> > > > > > >>> feature
> > > > > > >>> > > > (say
> > > > > > >>> > > > > MV) that has dependency (e.g. RPC). What would we
> > return
> > > > for
> > > > > > >>> > > > > FinalizedFeatures for MV in ApiVersionsResponse in
> that
> > > > case?
> > > > > > >>> > > > >
> > > > > > >>> > > > > Thanks,
> > > > > > >>> > > > >
> > > > > > >>> > > > > Jun
> > > > > > >>> > > > >
> > > > > > >>> > > > > On Fri, Feb 2, 2024 at 1:06 PM Justine Olshan
> > > > > > >>> > > > <jols...@confluent.io.invalid
> > > > > > >>> > > > > >
> > > > > > >>> > > > > wrote:
> > > > > > >>> > > > >
> > > > > > >>> > > > > > Hey Jun,
> > > > > > >>> > > > > >
> > > > > > >>> > > > > > Yes, the idea is that if we downgrade TV
> (transaction
> > > > > > version)
> > > > > > >>> we
> > > > > > >>> > > will
> > > > > > >>> > > > > stop
> > > > > > >>> > > > > > using the add partitions to txn optimization and
> stop
> > > > > writing
> > > > > > >>> the
> > > > > > >>> > > > > flexible
> > > > > > >>> > > > > > feature version of the log.
> > > > > > >>> > > > > > In the compatibility section I included some
> > > explanations
> > > > > on
> > > > > > >>> how
> > > > > > >>> > this
> > > > > > >>> > > > is
> > > > > > >>> > > > > > done.
> > > > > > >>> > > > > >
> > > > > > >>> > > > > > Thanks,
> > > > > > >>> > > > > > Justine
> > > > > > >>> > > > > >
> > > > > > >>> > > > > > On Fri, Feb 2, 2024 at 11:12 AM Jun Rao
> > > > > > >>> <j...@confluent.io.invalid>
> > > > > > >>> > > > > wrote:
> > > > > > >>> > > > > >
> > > > > > >>> > > > > > > Hi, Justine,
> > > > > > >>> > > > > > >
> > > > > > >>> > > > > > > Thanks for the update.
> > > > > > >>> > > > > > >
> > > > > > >>> > > > > > > If we ever downgrade the transaction feature, any
> > > > feature
> > > > > > >>> > depending
> > > > > > >>> > > > on
> > > > > > >>> > > > > > > changes on top of those RPC/record
> > > > > > >>> > > > > > > (AddPartitionsToTxnRequest/TransactionLogValue)
> > > changes
> > > > > > made
> > > > > > >>> in
> > > > > > >>> > > > KIP-890
> > > > > > >>> > > > > > > will be automatically downgraded too?
> > > > > > >>> > > > > > >
> > > > > > >>> > > > > > > Jun
> > > > > > >>> > > > > > >
> > > > > > >>> > > > > > > On Tue, Jan 30, 2024 at 3:32 PM Justine Olshan
> > > > > > >>> > > > > > > <jols...@confluent.io.invalid>
> > > > > > >>> > > > > > > wrote:
> > > > > > >>> > > > > > >
> > > > > > >>> > > > > > > > Hey Jun,
> > > > > > >>> > > > > > > >
> > > > > > >>> > > > > > > > I wanted to get back to you about your
> questions
> > > > about
> > > > > > >>> MV/IBP.
> > > > > > >>> > > > > > > >
> > > > > > >>> > > > > > > > Looking at the options, I think it makes the
> most
> > > > sense
> > > > > > to
> > > > > > >>> > > create a
> > > > > > >>> > > > > > > > separate feature for transactions and use that
> to
> > > > > version
> > > > > > >>> gate
> > > > > > >>> > > the
> > > > > > >>> > > > > > > features
> > > > > > >>> > > > > > > > we need to version gate (flexible transactional
> > > state
> > > > > > >>> records
> > > > > > >>> > and
> > > > > > >>> > > > > using
> > > > > > >>> > > > > > > the
> > > > > > >>> > > > > > > > new protocol)
> > > > > > >>> > > > > > > > I've updated the KIP to include this change.
> > > > Hopefully
> > > > > > >>> that's
> > > > > > >>> > > > > > everything
> > > > > > >>> > > > > > > we
> > > > > > >>> > > > > > > > need for this KIP :)
> > > > > > >>> > > > > > > >
> > > > > > >>> > > > > > > > Justine
> > > > > > >>> > > > > > > >
> > > > > > >>> > > > > > > >
> > > > > > >>> > > > > > > > On Mon, Jan 22, 2024 at 3:17 PM Justine Olshan
> <
> > > > > > >>> > > > jols...@confluent.io
> > > > > > >>> > > > > >
> > > > > > >>> > > > > > > > wrote:
> > > > > > >>> > > > > > > >
> > > > > > >>> > > > > > > > > Thanks Jun,
> > > > > > >>> > > > > > > > >
> > > > > > >>> > > > > > > > > I will update the KIP with the prev field for
> > > > prepare
> > > > > > as
> > > > > > >>> > well.
> > > > > > >>> > > > > > > > >
> > > > > > > >>> > > > > > > > > PREPARE
> > > > > > > >>> > > > > > > > > producerId: x
> > > > > > > >>> > > > > > > > > previous/lastProducerId (tagged field): x
> > > > > > > >>> > > > > > > > > nextProducerId (tagged field): empty or z if y will overflow
> > > > > > > >>> > > > > > > > > producerEpoch: y + 1
> > > > > > >>> > > > > > > > >
> > > > > > >>> > > > > > > > > COMPLETE
> > > > > > >>> > > > > > > > > producerId: x or z if y overflowed
> > > > > > >>> > > > > > > > > previous/lastProducerId (tagged field): x
> > > > > > >>> > > > > > > > > nextProducerId (tagged field): empty
> > > > > > >>> > > > > > > > > producerEpoch: y + 1 or 0 if we overflowed
> > > > > > >>> > > > > > > > >
> > > > > > >>> > > > > > > > > Thanks again,
> > > > > > >>> > > > > > > > > Justine
> > > > > > >>> > > > > > > > >
> > > > > > >>> > > > > > > > > On Mon, Jan 22, 2024 at 3:15 PM Jun Rao
> > > > > > >>> > > <j...@confluent.io.invalid
> > > > > > >>> > > > >
> > > > > > >>> > > > > > > > wrote:
> > > > > > >>> > > > > > > > >
> > > > > > >>> > > > > > > > >> Hi, Justine,
> > > > > > >>> > > > > > > > >>
> > > > > > >>> > > > > > > > >> 101.3 Thanks for the explanation.
> > > > > > >>> > > > > > > > >> (1) My point was that the coordinator could
> > fail
> > > > > right
> > > > > > >>> after
> > > > > > >>> > > > > writing
> > > > > > >>> > > > > > > the
> > > > > > >>> > > > > > > > >> prepare marker. When the new txn coordinator
> > > > > generates
> > > > > > >>> the
> > > > > > >>> > > > > complete
> > > > > > >>> > > > > > > > marker
> > > > > > >>> > > > > > > > >> after the failover, it needs some field from
> > the
> > > > > > prepare
> > > > > > >>> > > marker
> > > > > > >>> > > > to
> > > > > > >>> > > > > > > > >> determine whether it's written by the new
> > > client.
> > > > > > >>> > > > > > > > >>
> > > > > > >>> > > > > > > > >> (2) The changing of the behavior sounds good
> > to
> > > > me.
> > > > > We
> > > > > > >>> only
> > > > > > >>> > > want
> > > > > > >>> > > > > to
> > > > > > >>> > > > > > > > return
> > > > > > >>> > > > > > > > >> success if the prepare state is written by
> the
> > > new
> > > > > > >>> client.
> > > > > > >>> > So,
> > > > > > >>> > > > in
> > > > > > >>> > > > > > the
> > > > > > >>> > > > > > > > >> non-overflow case, it seems that we also
> need
> > > sth
> > > > in
> > > > > > the
> > > > > > >>> > > prepare
> > > > > > >>> > > > > > > marker
> > > > > > >>> > > > > > > > to
> > > > > > >>> > > > > > > > >> tell us whether it's written by the new
> > client.
> > > > > > >>> > > > > > > > >>
> > > > > > >>> > > > > > > > >> 112. Thanks for the explanation. That sounds
> > > good
> > > > to
> > > > > > me.
> > > > > > >>> > > > > > > > >>
> > > > > > >>> > > > > > > > >> Jun
> > > > > > >>> > > > > > > > >>
> > > > > > >>> > > > > > > > >> On Mon, Jan 22, 2024 at 11:32 AM Justine
> > Olshan
> > > > > > >>> > > > > > > > >> <jols...@confluent.io.invalid> wrote:
> > > > > > >>> > > > > > > > >>
> > > > > > > >>> > > > > > > > >> > 101.3 I realized that I actually have two questions.
> > > > > > > >>> > > > > > > > >> > (1) In the non-overflow case, we need to write the previous
> > > > > > > >>> > > > > > > > >> > producer ID tagged field in the end marker so that we know
> > > > > > > >>> > > > > > > > >> > if the marker is from the new client. Since the end marker
> > > > > > > >>> > > > > > > > >> > is derived from the prepare marker, should we write the
> > > > > > > >>> > > > > > > > >> > previous producer ID in the prepare marker field too?
> > > > > > > >>> > > > > > > > >> > Otherwise, we will lose this information when deriving the
> > > > > > > >>> > > > > > > > >> > end marker.
> > > > > > >>> > > > > > > > >> >
> > > > > > >>> > > > > > > > >> > The "previous" producer ID is in the
> normal
> > > > > producer
> > > > > > >>> ID
> > > > > > >>> > > field.
> > > > > > >>> > > > > So
> > > > > > >>> > > > > > > yes,
> > > > > > >>> > > > > > > > >> we
> > > > > > >>> > > > > > > > >> > need it in prepare and that was always the
> > > plan.
> > > > > > >>> > > > > > > > >> >
> > > > > > >>> > > > > > > > >> > Maybe it is a bit unclear so I will
> > enumerate
> > > > the
> > > > > > >>> fields
> > > > > > >>> > and
> > > > > > >>> > > > add
> > > > > > >>> > > > > > > them
> > > > > > >>> > > > > > > > to
> > > > > > >>> > > > > > > > >> > the KIP if that helps.
> > > > > > > >>> > > > > > > > >> > Say we have producer ID x and epoch y. When we overflow
> > > > > > > >>> > > > > > > > >> > epoch y, we get producer ID z.
> > > > > > > >>> > > > > > > > >> >
> > > > > > > >>> > > > > > > > >> > PREPARE
> > > > > > > >>> > > > > > > > >> > producerId: x
> > > > > > > >>> > > > > > > > >> > previous/lastProducerId (tagged field): empty
> > > > > > > >>> > > > > > > > >> > nextProducerId (tagged field): empty or z if y will overflow
> > > > > > > >>> > > > > > > > >> > producerEpoch: y + 1
> > > > > > >>> > > > > > > > >> >
> > > > > > >>> > > > > > > > >> > COMPLETE
> > > > > > >>> > > > > > > > >> > producerId: x or z if y overflowed
> > > > > > >>> > > > > > > > >> > previous/lastProducerId (tagged field): x
> > > > > > >>> > > > > > > > >> > nextProducerId (tagged field): empty
> > > > > > >>> > > > > > > > >> > producerEpoch: y + 1 or 0 if we overflowed
> > > > > > >>> > > > > > > > >> >
> > > > > > >>> > > > > > > > >> > (2) In the prepare phase, if we retry and
> > see
> > > > > epoch
> > > > > > -
> > > > > > >>> 1 +
> > > > > > >>> > ID
> > > > > > >>> > > > in
> > > > > > >>> > > > > > last
> > > > > > >>> > > > > > > > >> seen
> > > > > > >>> > > > > > > > >> > fields and are issuing the same command
> (ie
> > > > commit
> > > > > > not
> > > > > > >>> > > abort),
> > > > > > >>> > > > > we
> > > > > > >>> > > > > > > > return
> > > > > > >>> > > > > > > > >> > success. The logic before KIP-890 seems to
> > > > return
> > > > > > >>> > > > > > > > >> CONCURRENT_TRANSACTIONS
> > > > > > >>> > > > > > > > >> > in this case. Are we intentionally making
> > this
> > > > > > change?
> > > > > > >>> > > > > > > > >> >
> > > > > > >>> > > > > > > > >> > Hmm -- we would fence the producer if the
> > > epoch
> > > > is
> > > > > > >>> bumped
> > > > > > >>> > > and
> > > > > > >>> > > > we
> > > > > > >>> > > > > > > get a
> > > > > > >>> > > > > > > > >> > lower epoch. Yes -- we are intentionally
> > > adding
> > > > > this
> > > > > > >>> to
> > > > > > >>> > > > prevent
> > > > > > >>> > > > > > > > fencing.
> > > > > > >>> > > > > > > > >> >
> > > > > > >>> > > > > > > > >> >
> > > > > > >>> > > > > > > > >> > 112. We already merged the code that adds
> > the
> > > > > > >>> VerifyOnly
> > > > > > >>> > > field
> > > > > > >>> > > > > in
> > > > > > >>> > > > > > > > >> > AddPartitionsToTxnRequest, which is an
> inter
> > > > > broker
> > > > > > >>> > request.
> > > > > > >>> > > > It
> > > > > > >>> > > > > > > seems
> > > > > > >>> > > > > > > > >> that
> > > > > > >>> > > > > > > > >> > we didn't bump up the IBP for that. Do you
> > > know
> > > > > why?
> > > > > > >>> > > > > > > > >> >
> > > > > > >>> > > > > > > > >> > We no longer need IBP for all interbroker
> > > > requests
> > > > > > as
> > > > > > >>> > > > > ApiVersions
> > > > > > >>> > > > > > > > should
> > > > > > >>> > > > > > > > >> > correctly gate versioning.
> > > > > > >>> > > > > > > > >> > We also handle unsupported version errors
> > > > > correctly
> > > > > > >>> if we
> > > > > > >>> > > > > receive
> > > > > > >>> > > > > > > them
> > > > > > >>> > > > > > > > >> in
> > > > > > >>> > > > > > > > >> > edge cases like upgrades/downgrades.
> > > > > > >>> > > > > > > > >> >
> > > > > > >>> > > > > > > > >> > Justine
> > > > > > >>> > > > > > > > >> >
> > > > > > >>> > > > > > > > >> > On Mon, Jan 22, 2024 at 11:00 AM Jun Rao
> > > > > > >>> > > > > <j...@confluent.io.invalid
> > > > > > >>> > > > > > >
> > > > > > >>> > > > > > > > >> wrote:
> > > > > > >>> > > > > > > > >> >
> > > > > > >>> > > > > > > > >> > > Hi, Justine,
> > > > > > >>> > > > > > > > >> > >
> > > > > > >>> > > > > > > > >> > > Thanks for the reply.
> > > > > > >>> > > > > > > > >> > >
> > > > > > > >>> > > > > > > > >> > > 101.3 I realized that I actually have two questions.
> > > > > > > >>> > > > > > > > >> > > (1) In the non-overflow case, we need to write the previous
> > > > > > > >>> > > > > > > > >> > > producer ID tagged field in the end marker so that we know
> > > > > > > >>> > > > > > > > >> > > if the marker is from the new client. Since the end marker
> > > > > > > >>> > > > > > > > >> > > is derived from the prepare marker, should we write the
> > > > > > > >>> > > > > > > > >> > > previous producer ID in the prepare marker field too?
> > > > > > > >>> > > > > > > > >> > > Otherwise, we will lose this information when deriving the
> > > > > > > >>> > > > > > > > >> > > end marker.
> > > > > > >>> > > > > > > > >> > > (2) In the prepare phase, if we retry
> and
> > > see
> > > > > > epoch
> > > > > > >>> - 1
> > > > > > >>> > +
> > > > > > >>> > > ID
> > > > > > >>> > > > > in
> > > > > > >>> > > > > > > last
> > > > > > >>> > > > > > > > >> seen
> > > > > > >>> > > > > > > > >> > > fields and are issuing the same command
> > (ie
> > > > > commit
> > > > > > >>> not
> > > > > > >>> > > > abort),
> > > > > > >>> > > > > > we
> > > > > > >>> > > > > > > > >> return
> > > > > > >>> > > > > > > > >> > > success. The logic before KIP-890 seems
> to
> > > > > return
> > > > > > >>> > > > > > > > >> CONCURRENT_TRANSACTIONS
> > > > > > >>> > > > > > > > >> > > in this case. Are we intentionally
> making
> > > this
> > > > > > >>> change?
> > > > > > >>> > > > > > > > >> > >
> > > > > > >>> > > > > > > > >> > > 112. We already merged the code that
> adds
> > > the
> > > > > > >>> VerifyOnly
> > > > > > >>> > > > field
> > > > > > >>> > > > > > in
> > > > > > >>> > > > > > > > >> > > AddPartitionsToTxnRequest, which is an
> > inter
> > > > > > broker
> > > > > > >>> > > request.
> > > > > > >>> > > > > It
> > > > > > >>> > > > > > > > seems
> > > > > > >>> > > > > > > > >> > that
> > > > > > >>> > > > > > > > >> > > we didn't bump up the IBP for that. Do
> you
> > > > know
> > > > > > why?
> > > > > > >>> > > > > > > > >> > >
> > > > > > >>> > > > > > > > >> > > Jun
> > > > > > >>> > > > > > > > >> > >
> > > > > > >>> > > > > > > > >> > > On Fri, Jan 19, 2024 at 4:50 PM Justine
> > > Olshan
> > > > > > >>> > > > > > > > >> > > <jols...@confluent.io.invalid>
> > > > > > >>> > > > > > > > >> > > wrote:
> > > > > > >>> > > > > > > > >> > >
> > > > > > >>> > > > > > > > >> > > > Hi Jun,
> > > > > > >>> > > > > > > > >> > > >
> > > > > > >>> > > > > > > > >> > > > 101.3 I can change "last seen" to
> > "current
> > > > > > >>> producer id
> > > > > > >>> > > and
> > > > > > >>> > > > > > > epoch"
> > > > > > >>> > > > > > > > if
> > > > > > >>> > > > > > > > >> > that
> > > > > > >>> > > > > > > > >> > > > was the part that was confusing
> > > > > > >>> > > > > > > > >> > > > 110 I can mention this
> > > > > > >>> > > > > > > > >> > > > 111 I can do that
> > > > > > >>> > > > > > > > >> > > > 112 We still need it. But I am still
> > > > > finalizing
> > > > > > >>> the
> > > > > > >>> > > > design.
> > > > > > >>> > > > > I
> > > > > > >>> > > > > > > will
> > > > > > >>> > > > > > > > >> > update
> > > > > > >>> > > > > > > > >> > > > the KIP once I get the information
> > > > finalized.
> > > > > > >>> Sorry
> > > > > > >>> > for
> > > > > > >>> > > > the
> > > > > > >>> > > > > > > > delays.
> > > > > > >>> > > > > > > > >> > > >
> > > > > > >>> > > > > > > > >> > > > Justine
> > > > > > >>> > > > > > > > >> > > >
> > > > > > >>> > > > > > > > >> > > > On Fri, Jan 19, 2024 at 10:50 AM Jun
> Rao
> > > > > > >>> > > > > > > <j...@confluent.io.invalid
> > > > > > >>> > > > > > > > >
> > > > > > >>> > > > > > > > >> > > wrote:
> > > > > > >>> > > > > > > > >> > > >
> > > > > > >>> > > > > > > > >> > > > > Hi, Justine,
> > > > > > >>> > > > > > > > >> > > > >
> > > > > > >>> > > > > > > > >> > > > > Thanks for the reply.
> > > > > > >>> > > > > > > > >> > > > >
> > > > > > > >>> > > > > > > > >> > > > > 101.3 In the non-overflow case, the previous ID is the same
> > > > > > > >>> > > > > > > > >> > > > > as the producer ID for the complete marker too, but we set
> > > > > > > >>> > > > > > > > >> > > > > the previous ID in the complete marker. Earlier you
> > > > > > > >>> > > > > > > > >> > > > > mentioned that this is to know that the marker is written
> > > > > > > >>> > > > > > > > >> > > > > by the new client so that we could return success on
> > > > > > > >>> > > > > > > > >> > > > > retried endMarker requests. I was trying to understand why
> > > > > > > >>> > > > > > > > >> > > > > this is not needed for the prepare marker since retry can
> > > > > > > >>> > > > > > > > >> > > > > happen in the prepare state too. Is the reason that in the
> > > > > > > >>> > > > > > > > >> > > > > prepare state, we return CONCURRENT_TRANSACTIONS instead of
> > > > > > > >>> > > > > > > > >> > > > > success on retried endMarker requests? If so, should we
> > > > > > > >>> > > > > > > > >> > > > > change "If we retry and see epoch - 1 + ID in last seen
> > > > > > > >>> > > > > > > > >> > > > > fields and are issuing the same command (ie commit not
> > > > > > > >>> > > > > > > > >> > > > > abort) we can return (with the new epoch)" accordingly?
> > > > > > >>> > > > > > > > >> > > > >
> > > > > > > >>> > > > > > > > >> > > > > 110. Yes, without this KIP, a delayed endMarker request
> > > > > > > >>> > > > > > > > >> > > > > carries the same epoch and won't be fenced. This can
> > > > > > > >>> > > > > > > > >> > > > > commit/abort a future transaction unexpectedly. I am not
> > > > > > > >>> > > > > > > > >> > > > > sure if we have seen this in practice though.
> > > > > > >>> > > > > > > > >> > > > >
> > > > > > >>> > > > > > > > >> > > > > 111. Sounds good. It would be useful
> > to
> > > > make
> > > > > > it
> > > > > > >>> > clear
> > > > > > >>> > > > that
> > > > > > >>> > > > > > we
> > > > > > >>> > > > > > > > can
> > > > > > >>> > > > > > > > >> now
> > > > > > >>> > > > > > > > >> > > > > populate the lastSeen field from the
> > log
> > > > > > >>> reliably.
> > > > > > >>> > > > > > > > >> > > > >
> > > > > > >>> > > > > > > > >> > > > > 112. Yes, I was referring to
> > > > > > >>> > AddPartitionsToTxnRequest
> > > > > > >>> > > > > since
> > > > > > >>> > > > > > > > it's
> > > > > > >>> > > > > > > > >> > > called
> > > > > > >>> > > > > > > > >> > > > > across brokers and we are changing
> its
> > > > > schema.
> > > > > > >>> Are
> > > > > > >>> > you
> > > > > > >>> > > > > > saying
> > > > > > >>> > > > > > > we
> > > > > > >>> > > > > > > > >> > don't
> > > > > > >>> > > > > > > > >> > > > need
> > > > > > >>> > > > > > > > >> > > > > it any more? I thought that we
> already
> > > > > > >>> implemented
> > > > > > >>> > the
> > > > > > >>> > > > > > server
> > > > > > >>> > > > > > > > side
> > > > > > >>> > > > > > > > >> > > > > verification logic based on
> > > > > > >>> > AddPartitionsToTxnRequest
> > > > > > >>> > > > > across
> > > > > > >>> > > > > > > > >> brokers.
> > > > > > >>> > > > > > > > >> > > > >
> > > > > > >>> > > > > > > > >> > > > > Jun
> > > > > > >>> > > > > > > > >> > > > >
> > > > > > >>> > > > > > > > >> > > > >
> > > > > > >>> > > > > > > > >> > > > > On Thu, Jan 18, 2024 at 5:05 PM
> > Justine
> > > > > Olshan
> > > > > > >>> > > > > > > > >> > > > > <jols...@confluent.io.invalid>
> > > > > > >>> > > > > > > > >> > > > > wrote:
> > > > > > >>> > > > > > > > >> > > > >
> > > > > > >>> > > > > > > > >> > > > > > Hey Jun,
> > > > > > >>> > > > > > > > >> > > > > >
> > > > > > >>> > > > > > > > >> > > > > > 101.3 We don't set the previous ID
> > in
> > > > the
> > > > > > >>> Prepare
> > > > > > >>> > > > field
> > > > > > >>> > > > > > > since
> > > > > > >>> > > > > > > > we
> > > > > > >>> > > > > > > > >> > > don't
> > > > > > >>> > > > > > > > >> > > > > need
> > > > > > >>> > > > > > > > >> > > > > > it. It is the same producer ID as
> > the
> > > > main
> > > > > > >>> > producer
> > > > > > >>> > > ID
> > > > > > >>> > > > > > > field.
> > > > > > >>> > > > > > > > >> > > > > >
> > > > > > >>> > > > > > > > >> > > > > > 110 Hmm -- maybe I need to reread
> > your
> > > > > > message
> > > > > > >>> > about
> > > > > > >>> > > > > > delayed
> > > > > > >>> > > > > > > > >> > markers.
> > > > > > >>> > > > > > > > >> > > > If
> > > > > > >>> > > > > > > > >> > > > > we
> > > > > > >>> > > > > > > > >> > > > > > receive a delayed endTxn marker
> > after
> > > > the
> > > > > > >>> > > transaction
> > > > > > >>> > > > is
> > > > > > >>> > > > > > > > already
> > > > > > >>> > > > > > > > >> > > > > complete?
> > > > > > >>> > > > > > > > >> > > > > > So we will commit the next
> > transaction
> > > > > early
> > > > > > >>> > without
> > > > > > >>> > > > the
> > > > > > >>> > > > > > > fixes
> > > > > > >>> > > > > > > > >> in
> > > > > > >>> > > > > > > > >> > > part
> > > > > > >>> > > > > > > > >> > > > 2?
> > > > > > >>> > > > > > > > >> > > > > >
> > > > > > >>> > > > > > > > >> > > > > > 111 Yes -- this terminology was
> used
> > > in
> > > > a
> > > > > > >>> previous
> > > > > > >>> > > KIP
> > > > > > >>> > > > > and
> > > > > > >>> > > > > > > > never
> > > > > > > >>> > > > > > > > >> > > > > > implemented in the log -- only
> in
> > > > > memory
> > > > > > >>> > > > > > > > >> > > > > >
> > > > > > >>> > > > > > > > >> > > > > > 112 Hmm -- which interbroker
> > protocol
> > > > are
> > > > > > you
> > > > > > >>> > > > referring
> > > > > > >>> > > > > > to?
> > > > > > >>> > > > > > > I
> > > > > > >>> > > > > > > > am
> > > > > > >>> > > > > > > > >> > > > working
> > > > > > >>> > > > > > > > >> > > > > on
> > > > > > >>> > > > > > > > >> > > > > > the design for the work to remove
> > the
> > > > > extra
> > > > > > >>> add
> > > > > > >>> > > > > partitions
> > > > > > >>> > > > > > > > call
> > > > > > > >>> > > > > > > > >> > and
> > > > > > >>> > > > > > > > >> > > > > right
> > > > > > >>> > > > > > > > >> > > > > > now the design bumps MV. I have
> yet
> > to
> > > > > > update
> > > > > > >>> that
> > > > > > >>> > > > > section
> > > > > > >>> > > > > > > as
> > > > > > >>> > > > > > > > I
> > > > > > >>> > > > > > > > >> > > > finalize
> > > > > > >>> > > > > > > > >> > > > > > the design so please stay tuned.
> Was
> > > > there
> > > > > > >>> > anything
> > > > > > >>> > > > else
> > > > > > >>> > > > > > you
> > > > > > >>> > > > > > > > >> > thought
> > > > > > >>> > > > > > > > >> > > > > needed
> > > > > > >>> > > > > > > > >> > > > > > MV bump?
> > > > > > >>> > > > > > > > >> > > > > >
> > > > > > >>> > > > > > > > >> > > > > > Justine
> > > > > > >>> > > > > > > > >> > > > > >
> > > > > > >>> > > > > > > > >> > > > > > On Thu, Jan 18, 2024 at 3:07 PM
> Jun
> > > Rao
> > > > > > >>> > > > > > > > >> <j...@confluent.io.invalid>
> > > > > > >>> > > > > > > > >> > > > > wrote:
> > > > > > >>> > > > > > > > >> > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > Hi, Justine,
> > > > > > >>> > > > > > > > >> > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > I don't see this create any
> issue.
> > > It
> > > > > just
> > > > > > >>> makes
> > > > > > >>> > > it
> > > > > > >>> > > > a
> > > > > > >>> > > > > > bit
> > > > > > >>> > > > > > > > >> hard to
> > > > > > >>> > > > > > > > >> > > > > explain
> > > > > > > >>> > > > > > > > >> > > > > > > what this non-tagged producer id
> > > field
> > > > > > >>> means. We
> > > > > > >>> > > are
> > > > > > >>> > > > > > > > >> essentially
> > > > > > >>> > > > > > > > >> > > > trying
> > > > > > >>> > > > > > > > >> > > > > to
> > > > > > >>> > > > > > > > >> > > > > > > combine two actions (completing
> a
> > > txn
> > > > > and
> > > > > > >>> init a
> > > > > > >>> > > new
> > > > > > > >>> > > > > > > producer
> > > > > > >>> > > > > > > > >> Id)
> > > > > > >>> > > > > > > > >> > > in a
> > > > > > >>> > > > > > > > >> > > > > > > single record. But, this may be
> > fine
> > > > > too.
> > > > > > >>> > > > > > > > >> > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > A few other follow up comments.
> > > > > > >>> > > > > > > > >> > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > 101.3 I guess the reason that we
> > > only
> > > > > set
> > > > > > >>> the
> > > > > > >>> > > > previous
> > > > > > > >>> > > > > > > > >> producer id
> > > > > > >>> > > > > > > > >> > > > > tagged
> > > > > > >>> > > > > > > > >> > > > > > > field in the complete marker,
> but
> > > not
> > > > in
> > > > > > the
> > > > > > >>> > > prepare
> > > > > > >>> > > > > > > marker,
> > > > > > >>> > > > > > > > >> is
> > > > > > >>> > > > > > > > >> > > that
> > > > > > >>> > > > > > > > >> > > > in
> > > > > > >>> > > > > > > > >> > > > > > the
> > > > > > >>> > > > > > > > >> > > > > > > prepare state, we always return
> > > > > > >>> > > > > CONCURRENT_TRANSACTIONS
> > > > > > >>> > > > > > on
> > > > > > >>> > > > > > > > >> > retried
> > > > > > > >>> > > > > > > > >> > > > > > endMarker
> > > > > > >>> > > > > > > > >> > > > > > > requests?
> > > > > > >>> > > > > > > > >> > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > 110. "I believe your second
> point
> > is
> > > > > > >>> mentioned
> > > > > > >>> > in
> > > > > > >>> > > > the
> > > > > > >>> > > > > > > KIP. I
> > > > > > >>> > > > > > > > >> can
> > > > > > >>> > > > > > > > >> > > add
> > > > > > >>> > > > > > > > >> > > > > more
> > > > > > >>> > > > > > > > >> > > > > > > text on
> > > > > > >>> > > > > > > > >> > > > > > > this if it is helpful.
> > > > > > >>> > > > > > > > >> > > > > > > > The delayed message case can
> > also
> > > > > > violate
> > > > > > >>> EOS
> > > > > > >>> > if
> > > > > > >>> > > > the
> > > > > > >>> > > > > > > > delayed
> > > > > > >>> > > > > > > > >> > > > message
> > > > > > >>> > > > > > > > >> > > > > > > comes in after the next
> > > > > addPartitionsToTxn
> > > > > > >>> > request
> > > > > > >>> > > > > comes
> > > > > > >>> > > > > > > in.
> > > > > > >>> > > > > > > > >> > > > > Effectively
> > > > > > >>> > > > > > > > >> > > > > > we
> > > > > > >>> > > > > > > > >> > > > > > > may see a message from a
> previous
> > > > > > (aborted)
> > > > > > >>> > > > > transaction
> > > > > > >>> > > > > > > > become
> > > > > > >>> > > > > > > > >> > part
> > > > > > >>> > > > > > > > >> > > > of
> > > > > > >>> > > > > > > > >> > > > > > the
> > > > > > >>> > > > > > > > >> > > > > > > next transaction."
> > > > > > >>> > > > > > > > >> > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > The above is the case when a
> > delayed
> > > > > > >>> message is
> > > > > > >>> > > > > appended
> > > > > > >>> > > > > > > to
> > > > > > >>> > > > > > > > >> the
> > > > > > >>> > > > > > > > >> > > data
> > > > > > >>> > > > > > > > >> > > > > > > partition. What I mentioned is a
> > > > > slightly
> > > > > > >>> > > different
> > > > > > >>> > > > > case
> > > > > > >>> > > > > > > > when
> > > > > > >>> > > > > > > > >> a
> > > > > > >>> > > > > > > > >> > > > delayed
> > > > > > >>> > > > > > > > >> > > > > > > marker is appended to the
> > > transaction
> > > > > log
> > > > > > >>> > > partition.
> > > > > > >>> > > > > > > > >> > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > 111. The KIP says "Once we move
> > past
> > > > the
> > > > > > >>> Prepare
> > > > > > >>> > > and
> > > > > > >>> > > > > > > > Complete
> > > > > > >>> > > > > > > > >> > > states,
> > > > > > >>> > > > > > > > >> > > > > we
> > > > > > >>> > > > > > > > >> > > > > > > don’t need to worry about
> lastSeen
> > > > > fields
> > > > > > >>> and
> > > > > > >>> > > clear
> > > > > > >>> > > > > > them,
> > > > > > >>> > > > > > > > just
> > > > > > >>> > > > > > > > >> > > handle
> > > > > > >>> > > > > > > > >> > > > > > state
> > > > > > >>> > > > > > > > >> > > > > > > transitions as normal.". Is the
> > > > lastSeen
> > > > > > >>> field
> > > > > > >>> > the
> > > > > > >>> > > > > same
> > > > > > >>> > > > > > as
> > > > > > >>> > > > > > > > the
> > > > > > >>> > > > > > > > >> > > > previous
> > > > > > > >>> > > > > > > > >> > > > > > > Producer Id tagged field in
> > > > > > >>> TransactionLogValue?
> > > > > > >>> > > > > > > > >> > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > 112. Since the kip changes the
> > > > > > inter-broker
> > > > > > >>> > > > protocol,
> > > > > > >>> > > > > > > should
> > > > > > >>> > > > > > > > >> we
> > > > > > >>> > > > > > > > >> > > bump
> > > > > > >>> > > > > > > > >> > > > up
> > > > > > >>> > > > > > > > >> > > > > > the
> > > > > > >>> > > > > > > > >> > > > > > > MV/IBP version? Is this feature
> > only
> > > > for
> > > > > > the
> > > > > > >>> > KRaft
> > > > > > >>> > > > > mode?
> > > > > > >>> > > > > > > > >> > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > Thanks,
> > > > > > >>> > > > > > > > >> > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > Jun
> > > > > > >>> > > > > > > > >> > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > On Wed, Jan 17, 2024 at 11:13 AM
> > > > Justine
> > > > > > >>> Olshan
> > > > > > >>> > > > > > > > >> > > > > > > <jols...@confluent.io.invalid>
> > > wrote:
> > > > > > >>> > > > > > > > >> > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > Hey Jun,
> > > > > > >>> > > > > > > > >> > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > I'm glad we are getting to
> > > > convergence
> > > > > > on
> > > > > > >>> the
> > > > > > >>> > > > > design.
> > > > > > >>> > > > > > :)
> > > > > > >>> > > > > > > > >> > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > While I understand it seems a
> > > little
> > > > > > >>> "weird".
> > > > > > >>> > > I'm
> > > > > > >>> > > > > not
> > > > > > >>> > > > > > > sure
> > > > > > >>> > > > > > > > >> what
> > > > > > >>> > > > > > > > >> > > the
> > > > > > >>> > > > > > > > >> > > > > > > benefit
> > > > > > >>> > > > > > > > >> > > > > > > > of writing an extra record to
> > the
> > > > log.
> > > > > > >>> > > > > > > > >> > > > > > > > Is the concern a tool to
> > describe
> > > > > > >>> transactions
> > > > > > >>> > > > won't
> > > > > > >>> > > > > > > work
> > > > > > >>> > > > > > > > >> (ie,
> > > > > > >>> > > > > > > > >> > > the
> > > > > > >>> > > > > > > > >> > > > > > > complete
> > > > > > >>> > > > > > > > >> > > > > > > > state is needed to calculate
> the
> > > > time
> > > > > > >>> since
> > > > > > >>> > the
> > > > > > >>> > > > > > > > transaction
> > > > > > >>> > > > > > > > >> > > > > completed?)
> > > > > > >>> > > > > > > > >> > > > > > > > If we have a reason like this,
> > it
> > > is
> > > > > > >>> enough to
> > > > > > >>> > > > > > convince
> > > > > > >>> > > > > > > me
> > > > > > >>> > > > > > > > >> we
> > > > > > >>> > > > > > > > >> > > need
> > > > > > >>> > > > > > > > >> > > > > such
> > > > > > >>> > > > > > > > >> > > > > > > an
> > > > > > >>> > > > > > > > >> > > > > > > > extra record. It seems like it
> > > would
> > > > > be
> > > > > > >>> > > replacing
> > > > > > >>> > > > > the
> > > > > > >>> > > > > > > > record
> > > > > > >>> > > > > > > > >> > > > written
> > > > > > >>> > > > > > > > >> > > > > on
> > > > > > >>> > > > > > > > >> > > > > > > > InitProducerId. Is this
> correct?
> > > > > > >>> > > > > > > > >> > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > Thanks,
> > > > > > >>> > > > > > > > >> > > > > > > > Justine
> > > > > > >>> > > > > > > > >> > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > On Tue, Jan 16, 2024 at
> 5:14 PM
> > > Jun
> > > > > Rao
> > > > > > >>> > > > > > > > >> > <j...@confluent.io.invalid
> > > > > > >>> > > > > > > > >> > > >
> > > > > > >>> > > > > > > > >> > > > > > > wrote:
> > > > > > >>> > > > > > > > >> > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > Hi, Justine,
> > > > > > >>> > > > > > > > >> > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > Thanks for the explanation.
> I
> > > > > > >>> understand the
> > > > > > >>> > > > > > intention
> > > > > > >>> > > > > > > > >> now.
> > > > > > >>> > > > > > > > >> > In
> > > > > > >>> > > > > > > > >> > > > the
> > > > > > >>> > > > > > > > >> > > > > > > > overflow
> > > > > > >>> > > > > > > > >> > > > > > > > > case, we set the non-tagged
> > > field
> > > > to
> > > > > > >>> the old
> > > > > > >>> > > pid
> > > > > > >>> > > > > > (and
> > > > > > >>> > > > > > > > the
> > > > > > >>> > > > > > > > >> max
> > > > > > >>> > > > > > > > >> > > > > epoch)
> > > > > > >>> > > > > > > > >> > > > > > in
> > > > > > >>> > > > > > > > >> > > > > > > > the
> > > > > > >>> > > > > > > > >> > > > > > > > > prepare marker so that we
> > could
> > > > > > >>> correctly
> > > > > > >>> > > write
> > > > > > >>> > > > > the
> > > > > > >>> > > > > > > > >> marker to
> > > > > > >>> > > > > > > > >> > > the
> > > > > > >>> > > > > > > > >> > > > > > data
> > > > > > >>> > > > > > > > >> > > > > > > > > partition if the broker
> > > > downgrades.
> > > > > > When
> > > > > > >>> > > writing
> > > > > > >>> > > > > the
> > > > > > >>> > > > > > > > >> complete
> > > > > > >>> > > > > > > > >> > > > > marker,
> > > > > > >>> > > > > > > > >> > > > > > > we
> > > > > > >>> > > > > > > > >> > > > > > > > > know the marker has already
> > been
> > > > > > >>> written to
> > > > > > >>> > > the
> > > > > > >>> > > > > data
> > > > > > >>> > > > > > > > >> > partition.
> > > > > > >>> > > > > > > > >> > > > We
> > > > > > >>> > > > > > > > >> > > > > > set
> > > > > > >>> > > > > > > > >> > > > > > > > the
> > > > > > >>> > > > > > > > >> > > > > > > > > non-tagged field to the new
> > pid
> > > to
> > > > > > avoid
> > > > > > >>> > > > > > > > >> > > > InvalidPidMappingException
> > > > > > >>> > > > > > > > >> > > > > > in
> > > > > > >>> > > > > > > > >> > > > > > > > the
> > > > > > >>> > > > > > > > >> > > > > > > > > client if the broker
> > downgrades.
> > > > > > >>> > > > > > > > >> > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > The above seems to work.
> It's
> > > > just a
> > > > > > bit
> > > > > > >>> > > > > > inconsistent
> > > > > > >>> > > > > > > > for
> > > > > > >>> > > > > > > > >> a
> > > > > > >>> > > > > > > > >> > > > prepare
> > > > > > >>> > > > > > > > >> > > > > > > > marker
> > > > > > >>> > > > > > > > >> > > > > > > > > and a complete marker to use
> > > > > different
> > > > > > >>> pids
> > > > > > >>> > in
> > > > > > >>> > > > > this
> > > > > > >>> > > > > > > > >> special
> > > > > > >>> > > > > > > > >> > > case.
> > > > > > >>> > > > > > > > >> > > > > If
> > > > > > >>> > > > > > > > >> > > > > > we
> > > > > > >>> > > > > > > > >> > > > > > > > > downgrade with the complete
> > > > marker,
> > > > > it
> > > > > > >>> seems
> > > > > > >>> > > > that
> > > > > > >>> > > > > we
> > > > > > >>> > > > > > > > will
> > > > > > >>> > > > > > > > >> > never
> > > > > > >>> > > > > > > > >> > > > be
> > > > > > >>> > > > > > > > >> > > > > > able
> > > > > > >>> > > > > > > > >> > > > > > > > to
> > > > > > >>> > > > > > > > >> > > > > > > > > write the complete marker
> with
> > > the
> > > > > old
> > > > > > >>> pid.
> > > > > > >>> > > Not
> > > > > > >>> > > > > sure
> > > > > > >>> > > > > > > if
> > > > > > >>> > > > > > > > it
> > > > > > >>> > > > > > > > >> > > causes
> > > > > > >>> > > > > > > > >> > > > > any
> > > > > > >>> > > > > > > > >> > > > > > > > > issue, but it seems a bit
> > weird.
> > > > > > >>> Instead of
> > > > > > >>> > > > > writing
> > > > > > >>> > > > > > > the
> > > > > > >>> > > > > > > > >> > > complete
> > > > > > >>> > > > > > > > >> > > > > > marker
> > > > > > >>> > > > > > > > >> > > > > > > > > with the new pid, could we
> > write
> > > > two
> > > > > > >>> > records:
> > > > > > >>> > > a
> > > > > > >>> > > > > > > complete
> > > > > > >>> > > > > > > > >> > marker
> > > > > > >>> > > > > > > > >> > > > > with
> > > > > > >>> > > > > > > > >> > > > > > > the
> > > > > > >>> > > > > > > > >> > > > > > > > > old pid followed by a
> > > > > > >>> TransactionLogValue
> > > > > > >>> > with
> > > > > > >>> > > > the
> > > > > > >>> > > > > > new
> > > > > > >>> > > > > > > > pid
> > > > > > >>> > > > > > > > >> > and
> > > > > > >>> > > > > > > > >> > > an
> > > > > > >>> > > > > > > > >> > > > > > empty
> > > > > > >>> > > > > > > > >> > > > > > > > > state? We could make the two
> > > > records
> > > > > > in
> > > > > > >>> the
> > > > > > >>> > > same
> > > > > > >>> > > > > > batch
> > > > > > >>> > > > > > > > so
> > > > > > >>> > > > > > > > >> > that
> > > > > > >>> > > > > > > > >> > > > they
> > > > > > >>> > > > > > > > >> > > > > > > will
> > > > > > >>> > > > > > > > >> > > > > > > > be
> > > > > > >>> > > > > > > > >> > > > > > > > > added to the log atomically.
> > > > > > >>> > > > > > > > >> > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > Thanks,
> > > > > > >>> > > > > > > > >> > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > Jun
> > > > > > >>> > > > > > > > >> > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > On Fri, Jan 12, 2024 at
> > 5:40 PM
> > > > > > Justine
> > > > > > >>> > Olshan
> > > > > > >>> > > > > > > > >> > > > > > > > >
> <jols...@confluent.io.invalid
> > >
> > > > > > >>> > > > > > > > >> > > > > > > > > wrote:
> > > > > > >>> > > > > > > > >> > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > (1) the prepare marker is
> > > > written,
> > > > > > >>> but the
> > > > > > >>> > > > > endTxn
> > > > > > >>> > > > > > > > >> response
> > > > > > >>> > > > > > > > >> > is
> > > > > > >>> > > > > > > > >> > > > not
> > > > > > >>> > > > > > > > >> > > > > > > > > received
> > > > > > >>> > > > > > > > >> > > > > > > > > > by the client when the
> > server
> > > > > > >>> downgrades
> > > > > > >>> > > > > > > > >> > > > > > > > > > (2)  the prepare marker is
> > > > > written,
> > > > > > >>> the
> > > > > > >>> > > endTxn
> > > > > > >>> > > > > > > > response
> > > > > > >>> > > > > > > > >> is
> > > > > > >>> > > > > > > > >> > > > > received
> > > > > > >>> > > > > > > > >> > > > > > > by
> > > > > > >>> > > > > > > > >> > > > > > > > > the
> > > > > > >>> > > > > > > > >> > > > > > > > > > client when the server
> > > > downgrades.
> > > > > > >>> > > > > > > > >> > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > I think I am still a
> little
> > > > > > confused.
> > > > > > >>> In
> > > > > > >>> > > both
> > > > > > >>> > > > of
> > > > > > >>> > > > > > > these
> > > > > > >>> > > > > > > > >> > cases,
> > > > > > >>> > > > > > > > >> > > > the
> > > > > > >>> > > > > > > > >> > > > > > > > > > transaction log has the
> old
> > > > > producer
> > > > > > >>> ID.
> > > > > > >>> > We
> > > > > > >>> > > > > don't
> > > > > > >>> > > > > > > > write
> > > > > > >>> > > > > > > > >> the
> > > > > > >>> > > > > > > > >> > > new
> > > > > > >>> > > > > > > > >> > > > > > > > producer
> > > > > > >>> > > > > > > > >> > > > > > > > > ID
> > > > > > >>> > > > > > > > >> > > > > > > > > > in the prepare marker's
> non
> > > > tagged
> > > > > > >>> fields.
> > > > > > >>> > > > > > > > >> > > > > > > > > > If the server downgrades
> > now,
> > > it
> > > > > > would
> > > > > > >>> > read
> > > > > > >>> > > > the
> > > > > > >>> > > > > > > > records
> > > > > > >>> > > > > > > > >> not
> > > > > > >>> > > > > > > > >> > > in
> > > > > > >>> > > > > > > > >> > > > > > tagged
> > > > > > >>> > > > > > > > >> > > > > > > > > > fields and the complete
> > marker
> > > > > will
> > > > > > >>> also
> > > > > > >>> > > have
> > > > > > >>> > > > > the
> > > > > > >>> > > > > > > old
> > > > > > >>> > > > > > > > >> > > producer
> > > > > > >>> > > > > > > > >> > > > > ID.
> > > > > > >>> > > > > > > > >> > > > > > > > > > (If we had used the new
> > > producer
> > > > > ID,
> > > > > > >>> we
> > > > > > >>> > > would
> > > > > > >>> > > > > not
> > > > > > >>> > > > > > > have
> > > > > > >>> > > > > > > > >> > > > > > transactional
> > > > > > >>> > > > > > > > >> > > > > > > > > > correctness since the
> > producer
> > > > id
> > > > > > >>> doesn't
> > > > > > >>> > > > match
> > > > > > >>> > > > > > the
> > > > > > >>> > > > > > > > >> > > transaction
> > > > > > >>> > > > > > > > >> > > > > and
> > > > > > >>> > > > > > > > >> > > > > > > the
> > > > > > >>> > > > > > > > >> > > > > > > > > > state would not be correct
> > on
> > > > the
> > > > > > data
> > > > > > >>> > > > > partition.)
> > > > > > >>> > > > > > > > >> > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > In the overflow case, I'd
> > > expect
> > > > > the
> > > > > > >>> > > following
> > > > > > >>> > > > > to
> > > > > > >>> > > > > > > > >> happen on
> > > > > > >>> > > > > > > > >> > > the
> > > > > > >>> > > > > > > > >> > > > > > > client
> > > > > > >>> > > > > > > > >> > > > > > > > > side
> > > > > > >>> > > > > > > > >> > > > > > > > > > Case 1  -- we retry EndTxn
> > --
> > > it
> > > > > is
> > > > > > >>> the
> > > > > > >>> > same
> > > > > > >>> > > > > > > producer
> > > > > > >>> > > > > > > > ID
> > > > > > >>> > > > > > > > >> > and
> > > > > > >>> > > > > > > > >> > > > > epoch
> > > > > > >>> > > > > > > > >> > > > > > -
> > > > > > >>> > > > > > > > >> > > > > > > 1
> > > > > > >>> > > > > > > > >> > > > > > > > > this
> > > > > > >>> > > > > > > > >> > > > > > > > > > would fence the producer
> > > > > > >>> > > > > > > > >> > > > > > > > > > Case 2 -- we don't retry
> > > EndTxn
> > > > > and
> > > > > > >>> use
> > > > > > >>> > the
> > > > > > >>> > > > new
> > > > > > >>> > > > > > > > >> producer id
> > > > > > >>> > > > > > > > >> > > > which
> > > > > > >>> > > > > > > > >> > > > > > > would
> > > > > > >>> > > > > > > > >> > > > > > > > > > result in
> > > > > InvalidPidMappingException
> > > > > > >>> > > > > > > > >> > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > Maybe we can have special
> > > > handling
> > > > > > for
> > > > > > >>> > when
> > > > > > >>> > > a
> > > > > > >>> > > > > > server
> > > > > > >>> > > > > > > > >> > > > downgrades.
> > > > > > >>> > > > > > > > >> > > > > > When
> > > > > > >>> > > > > > > > >> > > > > > > > it
> > > > > > >>> > > > > > > > >> > > > > > > > > > reconnects we could get an
> > API
> > > > > > version
> > > > > > >>> > > request
> > > > > > >>> > > > > > > showing
> > > > > > >>> > > > > > > > >> > > KIP-890
> > > > > > >>> > > > > > > > >> > > > > > part 2
> > > > > > >>> > > > > > > > >> > > > > > > > is
> > > > > > >>> > > > > > > > >> > > > > > > > > > not supported. In that
> case,
> > > we
> > > > > can
> > > > > > >>> call
> > > > > > >>> > > > > > > > initProducerId
> > > > > > >>> > > > > > > > >> to
> > > > > > >>> > > > > > > > >> > > > abort
> > > > > > >>> > > > > > > > >> > > > > > the
> > > > > > >>> > > > > > > > >> > > > > > > > > > transaction. (In the
> > overflow
> > > > > case,
> > > > > > >>> this
> > > > > > >>> > > > > correctly
> > > > > > >>> > > > > > > > gives
> > > > > > >>> > > > > > > > >> > us a
> > > > > > >>> > > > > > > > >> > > > new
> > > > > > >>> > > > > > > > >> > > > > > > > > producer
> > > > > > >>> > > > > > > > >> > > > > > > > > > ID)
> > > > > > >>> > > > > > > > >> > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > I guess the corresponding
> > case
> > > > > would
> > > > > > >>> be
> > > > > > >>> > > where
> > > > > > >>> > > > > the
> > > > > > >>> > > > > > > > >> *complete
> > > > > > >>> > > > > > > > >> > > > > marker
> > > > > > >>> > > > > > > > >> > > > > > > *is
> > > > > > >>> > > > > > > > >> > > > > > > > > > written but the endTxn is
> > not
> > > > > > >>> received by
> > > > > > >>> > > the
> > > > > > >>> > > > > > client
> > > > > > >>> > > > > > > > and
> > > > > > >>> > > > > > > > >> > the
> > > > > > >>> > > > > > > > >> > > > > server
> > > > > > >>> > > > > > > > >> > > > > > > > > > downgrades? This would
> > result
> > > in
> > > > > the
> > > > > > >>> > > > transaction
> > > > > > >>> > > > > > > > >> > coordinator
> > > > > > >>> > > > > > > > >> > > > > having
> > > > > > >>> > > > > > > > >> > > > > > > the
> > > > > > >>> > > > > > > > >> > > > > > > > > new
> > > > > > >>> > > > > > > > >> > > > > > > > > > ID and not the old one.
> If
> > > the
> > > > > > client
> > > > > > >>> > > > retries,
> > > > > > >>> > > > > it
> > > > > > >>> > > > > > > > will
> > > > > > >>> > > > > > > > >> > > receive
> > > > > > >>> > > > > > > > >> > > > > an
> > > > > > >>> > > > > > > > >> > > > > > > > > >
> InvalidPidMappingException.
> > > The
> > > > > > >>> > > InitProducerId
> > > > > > >>> > > > > > > > scenario
> > > > > > >>> > > > > > > > >> > above
> > > > > > >>> > > > > > > > >> > > > > would
> > > > > > >>> > > > > > > > >> > > > > > > > help
> > > > > > >>> > > > > > > > >> > > > > > > > > > here too.
> > > > > > >>> > > > > > > > >> > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > To be clear, my
> > compatibility
> > > > > story
> > > > > > is
> > > > > > >>> > meant
> > > > > > >>> > > > to
> > > > > > >>> > > > > > > > support
> > > > > > >>> > > > > > > > >> > > > > downgrades
> > > > > > >>> > > > > > > > >> > > > > > > > server
> > > > > > >>> > > > > > > > >> > > > > > > > > > side in keeping the
> > > > transactional
> > > > > > >>> > > correctness.
> > > > > > >>> > > > > > > Keeping
> > > > > > >>> > > > > > > > >> the
> > > > > > >>> > > > > > > > >> > > > client
> > > > > > >>> > > > > > > > >> > > > > > > from
> > > > > > >>> > > > > > > > >> > > > > > > > > > fencing itself is not the
> > > > > priority.
> > > > > > >>> > > > > > > > >> > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > Hope this helps. I can
> also
> > > add
> > > > > text
> > > > > > >>> in
> > > > > > >>> > the
> > > > > > >>> > > > KIP
> > > > > > >>> > > > > > > about
> > > > > > >>> > > > > > > > >> > > > > > InitProducerId
> > > > > > >>> > > > > > > > >> > > > > > > if
> > > > > > >>> > > > > > > > >> > > > > > > > > we
> > > > > > >>> > > > > > > > >> > > > > > > > > > think that fixes some edge
> > > > cases.
> > > > > > >>> > > > > > > > >> > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > Justine
> > > > > > >>> > > > > > > > >> > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > On Fri, Jan 12, 2024 at
> > > 4:10 PM
> > > > > Jun
> > > > > > >>> Rao
> > > > > > >>> > > > > > > > >> > > > <j...@confluent.io.invalid
> > > > > > >>> > > > > > > > >> > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > wrote:
> > > > > > >>> > > > > > > > >> > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > Hi, Justine,
> > > > > > >>> > > > > > > > >> > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > Thanks for the reply.
> > > > > > >>> > > > > > > > >> > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > I agree that we don't
> need
> > > to
> > > > > > >>> optimize
> > > > > > >>> > for
> > > > > > >>> > > > > > fencing
> > > > > > >>> > > > > > > > >> during
> > > > > > >>> > > > > > > > >> > > > > > > downgrades.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > Regarding consistency,
> > there
> > > > are
> > > > > > two
> > > > > > >>> > > > possible
> > > > > > >>> > > > > > > cases:
> > > > > > >>> > > > > > > > >> (1)
> > > > > > >>> > > > > > > > >> > > the
> > > > > > >>> > > > > > > > >> > > > > > > prepare
> > > > > > >>> > > > > > > > >> > > > > > > > > > marker
> > > > > > >>> > > > > > > > >> > > > > > > > > > > is written, but the
> endTxn
> > > > > > response
> > > > > > >>> is
> > > > > > >>> > not
> > > > > > >>> > > > > > > received
> > > > > > >>> > > > > > > > by
> > > > > > >>> > > > > > > > >> > the
> > > > > > >>> > > > > > > > >> > > > > client
> > > > > > >>> > > > > > > > >> > > > > > > > when
> > > > > > >>> > > > > > > > >> > > > > > > > > > the
> > > > > > >>> > > > > > > > >> > > > > > > > > > > server downgrades; (2)
> > the
> > > > > > prepare
> > > > > > >>> > marker
> > > > > > >>> > > > is
> > > > > > >>> > > > > > > > written,
> > > > > > >>> > > > > > > > >> > the
> > > > > > >>> > > > > > > > >> > > > > endTxn
> > > > > > >>> > > > > > > > >> > > > > > > > > > response
> > > > > > >>> > > > > > > > >> > > > > > > > > > > is received by the
> client
> > > when
> > > > > the
> > > > > > >>> > server
> > > > > > >>> > > > > > > > downgrades.
> > > > > > >>> > > > > > > > >> In
> > > > > > >>> > > > > > > > >> > > (1),
> > > > > > >>> > > > > > > > >> > > > > the
> > > > > > >>> > > > > > > > >> > > > > > > > > client
> > > > > > >>> > > > > > > > >> > > > > > > > > > > will have the old
> producer
> > Id
> > > > and
> > > > > > in
> > > > > > >>> (2),
> > > > > > >>> > > the
> > > > > > >>> > > > > > > client
> > > > > > >>> > > > > > > > >> will
> > > > > > >>> > > > > > > > >> > > have
> > > > > > >>> > > > > > > > >> > > > > the
> > > > > > >>> > > > > > > > >> > > > > > > new
> > > > > > > >>> > > > > > > > >> > > > > > > > > > > producer Id. If we
> > downgrade
> > > > > right
> > > > > > >>> after
> > > > > > >>> > > the
> > > > > > >>> > > > > > > prepare
> > > > > > >>> > > > > > > > >> > marker,
> > > > > > >>> > > > > > > > >> > > > we
> > > > > > >>> > > > > > > > >> > > > > > > can't
> > > > > > >>> > > > > > > > >> > > > > > > > be
> > > > > > >>> > > > > > > > >> > > > > > > > > > > consistent to both (1)
> and
> > > (2)
> > > > > > >>> since we
> > > > > > >>> > > can
> > > > > > >>> > > > > only
> > > > > > >>> > > > > > > put
> > > > > > >>> > > > > > > > >> one
> > > > > > >>> > > > > > > > >> > > > value
> > > > > > >>> > > > > > > > >> > > > > in
> > > > > > >>> > > > > > > > >> > > > > > > the
> > > > > > >>> > > > > > > > >> > > > > > > > > > > existing produce Id
> field.
> > > > It's
> > > > > > >>> also not
> > > > > > >>> > > > clear
> > > > > > >>> > > > > > > which
> > > > > > >>> > > > > > > > >> case
> > > > > > >>> > > > > > > > >> > > is
> > > > > > >>> > > > > > > > >> > > > > more
> > > > > > >>> > > > > > > > >> > > > > > > > > likely.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > So we could probably be
> > > > > consistent
> > > > > > >>> with
> > > > > > >>> > > > either
> > > > > > >>> > > > > > > case.
> > > > > > >>> > > > > > > > >> By
> > > > > > >>> > > > > > > > >> > > > putting
> > > > > > >>> > > > > > > > >> > > > > > the
> > > > > > >>> > > > > > > > >> > > > > > > > new
> > > > > > >>> > > > > > > > >> > > > > > > > > > > producer Id in the
> prepare
> > > > > marker,
> > > > > > >>> we
> > > > > > >>> > are
> > > > > > >>> > > > > > > consistent
> > > > > > >>> > > > > > > > >> with
> > > > > > >>> > > > > > > > >> > > > case
> > > > > > >>> > > > > > > > >> > > > > > (2)
> > > > > > >>> > > > > > > > >> > > > > > > > and
> > > > > > >>> > > > > > > > >> > > > > > > > > it
> > > > > > >>> > > > > > > > >> > > > > > > > > > > also has the slight
> > benefit
> > > > that
> > > > > > the
> > > > > > >>> > > produce
> > > > > > >>> > > > > > field
> > > > > > >>> > > > > > > > in
> > > > > > >>> > > > > > > > >> the
> > > > > > >>> > > > > > > > >> > > > > prepare
> > > > > > >>> > > > > > > > >> > > > > > > and
> > > > > > >>> > > > > > > > >> > > > > > > > > > > complete marker are
> > > consistent
> > > > > in
> > > > > > >>> the
> > > > > > >>> > > > overflow
> > > > > > >>> > > > > > > case.
> > > > > > >>> > > > > > > > >> > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > Jun
> > > > > > >>> > > > > > > > >> > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > On Fri, Jan 12, 2024 at 3:11 PM Justine Olshan
> > > > > > >>> > > > > > > > >> > > > > > > > > > > <jols...@confluent.io.invalid> wrote:
> > > > > > >>> > > > > > > > >> > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > Hi Jun,
> > > > > > >>> > > > > > > > >> > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > In the case you describe, we would need to have a delayed request,
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > send a successful EndTxn, and a successful AddPartitionsToTxn and then
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > have the delayed EndTxn request go through for a given producer.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > I'm trying to figure out if it is possible for the client to transition
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > if a previous request is delayed somewhere. But yes, in this case I
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > think we would fence the client.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > Not for the overflow case. In the overflow case, the producer ID and
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > the epoch are different on the marker and on the new transaction. So we
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > want the marker to use the max epoch but the new transaction should
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > start with the new ID and epoch 0 in the transactional state.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > In the server downgrade case, we want to see the producer ID as that is
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > what the client will have. If we complete the commit, and the
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > transaction state is reloaded, we need the new producer ID in the state
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > so there isn't an invalid producer ID mapping.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > The server downgrade cases are considering transactional correctness
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > and not regressing from previous behavior -- and are not concerned
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > about supporting the safety from fencing retries (as we have downgraded
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > so we don't need to support). Perhaps this is a trade off, but I think
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > it is the right one.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > (If the client downgrades, it will have restarted and it is ok for it
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > to have a new producer ID too).
> > > > > > >>> > > > > > > > >> > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > Justine
> > > > > > >>> > > > > > > > >> > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > On Fri, Jan 12, 2024 at 11:42 AM Jun Rao <j...@confluent.io.invalid>
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > wrote:
> > > > > > >>> > > > > > > > >> > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > Hi, Justine,
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > Thanks for the reply.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > 101.4 "If the marker is written by the new client, we can as I
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > mentioned in the last email guarantee that any EndTxn requests with
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > the same epoch are from the same producer and the same transaction.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > Then we don't have to return a fenced error but can handle gracefully
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > as described in the KIP."
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > When a delayed EndTxn request is processed, the txn state could be
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > ongoing for the next txn. I guess in this case we still return the
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > fenced error for the delayed request?
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > 102. Sorry, my question was inaccurate. What you described is
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > accurate.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > "The downgrade compatibility I mention is that we keep the same
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > producer ID and epoch in the main (non-tagged) fields as we did
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > before the code on the server side." If we want to do this, it seems
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > that we should use the current producer Id and max epoch in the
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > existing producerId and producerEpoch fields for both the prepare and
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > the complete marker, right?
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > The downgrade can happen after the complete marker is written. With
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > what you described, the downgraded coordinator will see the new
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > producer Id instead of the old one.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > Jun
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > On Fri, Jan 12, 2024 at 10:44 AM Justine Olshan
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > <jols...@confluent.io.invalid> wrote:
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > Hi Jun,
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > I can update the description.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > I believe your second point is mentioned in the KIP. I can add more
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > text on this if it is helpful.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > The delayed message case can also violate EOS if the delayed
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > message comes in after the next addPartitionsToTxn request comes
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > in. Effectively we may see a message from a previous (aborted)
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > transaction become part of the next transaction.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > If the marker is written by the new client, we can as I mentioned in
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > the last email guarantee that any EndTxn requests with the same epoch
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > are from the same producer and the same transaction. Then we don't
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > have to return a fenced error but can handle gracefully as described
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > in the KIP.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > I don't think a boolean is useful since it is directly encoded by the
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > existence or lack of the tagged field being written.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > In the prepare marker we will have the same producer ID in the
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > non-tagged field. In the Complete state we may not.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > I'm not sure why the ongoing state matters for this KIP. It does
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > matter for KIP-939.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > I'm not sure what you are referring to about writing the previous
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > producer ID in the prepare marker. This is not in the KIP.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > In the overflow case, we write the nextProducerId in the prepare
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > state. This is so we know what we assigned when we reload the
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > transaction log.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > Once we complete, we transition this ID to the main (non-tagged
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > field) and have the previous producer ID field filled in. This is so
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > we can identify in a retry case the operation completed successfully
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > and we don't fence our producer. The downgrade compatibility I
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > mention is that we keep the same producer ID and epoch in the main
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > (non-tagged) fields as we did before the code on the server side. If
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > the server downgrades, we are still compatible.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > This addresses both the prepare and complete state downgrades.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > Justine
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > On Fri, Jan 12, 2024 at 10:21 AM Jun Rao <j...@confluent.io.invalid>
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > wrote:
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > Hi, Justine,
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > Thanks for the reply. Sorry for the delay. I have a few more
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > comments.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > 110. I think the motivation section could be improved. One of the
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > motivations listed by the KIP is "This can happen when a message
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > gets stuck or delayed due to networking issues or a network
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > partition, the transaction aborts, and then the delayed message
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > finally comes in.". This seems not very accurate. Without KIP-890,
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > currently, if the coordinator times out and aborts an ongoing
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > transaction, it already bumps up the epoch in the marker, which
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > prevents the delayed produce message from being added to the user
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > partition. What can cause a hanging transaction is that the
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > producer completes (either aborts or commits) a transaction before
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > receiving a successful ack on messages published in the same txn.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > In this case, it's possible for the delayed message to be appended
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > to the partition after the marker, causing a transaction to hang.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > A similar issue (not mentioned in the motivation) could happen on the
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > marker in the coordinator's log. For example, it's possible for an
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > EndTxnRequest to be delayed on the coordinator. By the time the delayed
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > EndTxnRequest is processed, it's possible that the previous txn has already
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > completed and a new txn has started. Currently, since the epoch is not
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > bumped on every txn, the delayed EndTxnRequest will add an unexpected
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > prepare marker (and eventually a complete marker) to the ongoing txn. This
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > won't cause the transaction to hang, but it will break the EoS semantic.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > The proposal in this KIP will address this issue too.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > 101. "However, I
> > was
> > > > > > >>> writing it
> > > > > > >>> > so
> > > > > > >>> > > > > that
> > > > > > >>> > > > > > we
> > > > > > >>> > > > > > > > can
> > > > > > >>> > > > > > > > >> > > > > > distinguish
> > > > > > >>> > > > > > > > >> > > > > > > > > > between
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > old clients
> where
> > we
> > > > > don't
> > > > > > >>> have
> > > > > > >>> > > the
> > > > > > >>> > > > > > > ability
> > > > > > >>> > > > > > > > do
> > > > > > >>> > > > > > > > >> > this
> > > > > > >>> > > > > > > > >> > > > > > > operation
> > > > > > >>> > > > > > > > >> > > > > > > > > and
> > > > > > >>> > > > > > > > >> > > > > > > > > > > new
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > clients that
> can.
> > > (Old
> > > > > > >>> clients
> > > > > > >>> > > don't
> > > > > > >>> > > > > > bump
> > > > > > >>> > > > > > > > the
> > > > > > >>> > > > > > > > >> > epoch
> > > > > > >>> > > > > > > > >> > > > on
> > > > > > >>> > > > > > > > >> > > > > > > > commit,
> > > > > > >>> > > > > > > > >> > > > > > > > > so
> > > > > > >>> > > > > > > > >> > > > > > > > > > > we
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > can't
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > say for sure the
> > > write
> > > > > > >>> belongs
> > > > > > >>> > to
> > > > > > >>> > > > the
> > > > > > >>> > > > > > > given
> > > > > > >>> > > > > > > > >> > > > > > transaction)."
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > 101.1 I am wondering why we need to distinguish whether the marker is
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > written by the old or the new client. Could you describe what we do
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > differently if we know the marker is written by the new client?
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > 101.2 If we do need a way to distinguish whether the marker is written by
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > the old or the new client, would it be simpler to just introduce a boolean
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > field instead of doing so indirectly through the previous producer ID field?
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > 101.3 It's not clear to me why we only add the previous producer ID field in
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > the complete marker, but not in the prepare marker. If we want to know
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > whether a marker is written by the new client or not, it seems that we want
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > to do this consistently for all markers.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > 101.4 What about the TransactionLogValue record representing the ongoing
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > state? Should we also distinguish whether it's written by the old or the
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > new client?
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > 102. In the overflow case, it's still not clear to me why we write the
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > previous producer ID in the prepare marker while writing the next producer ID
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > in the complete marker. You mentioned that it's for downgrading. However,
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > we could downgrade with either the prepare marker or the complete marker.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > In either case, the downgraded coordinator should see the same producer ID
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > (probably the previous producer ID), right?
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > Jun
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > On Wed, Dec 20, 2023 at 6:00 PM Justine Olshan
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > <jols...@confluent.io.invalid> wrote:
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > Hey Jun,
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > Thanks for taking a look at the KIP again.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > 100. For the epoch overflow case, only the marker will have max epoch. This
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > keeps the behavior of the rest of the markers, where the last marker carries
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > the epoch of the transaction records + 1.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > 101. You are correct that we don't need to write the producer ID since it
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > is the same. However, I was writing it so that we can distinguish between
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > old clients where we don't have the ability to do this operation and new
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > clients that can. (Old clients don't bump the epoch on commit, so we can't
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > say for sure the write belongs to the given transaction). If we receive an
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > EndTxn request from a new client, we will fill this field. We can guarantee
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > that any EndTxn requests with the same epoch are from the same producer and
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > the same transaction.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > 102. In prepare phase, we have the same producer ID and epoch we always
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > had. It is the producer ID and epoch that are on the marker. In commit
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > phase, we stay the same unless it is the overflow case. In that case, we
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > set the producer ID to the new one we generated and epoch to 0 after
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > complete. This is for downgrade compatibility. The tagged fields are just
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > safety guards for retries and failovers.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > In prepare phase, for the epoch overflow case only, we store the next
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > producer ID. This is for the case where we reload the transaction
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > coordinator in prepare state. Once the transaction is committed, we can use
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > the producer ID the client is already using.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > In commit phase, we store the previous producer ID in case of retries.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > I think it is easier to think of it as just how we were storing producer ID
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > and epoch before, with some extra bookkeeping and edge case handling in the
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > tagged fields. We have to do it this way for compatibility with downgrades.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > 103. Next producer ID is for prepare status and previous producer ID is for
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > after complete. The reason why we need two separate (tagged) fields is for
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > backwards compatibility. We need to keep the same semantics for the
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > non-tagged field in case we downgrade.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > 104. We set the fields as we do in the transactional state (as we need to
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > do this for compatibility -- if we downgrade, we will only have the
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > non-tagged fields). It will be the old producer ID and max epoch.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > Hope this helps. Let me know if you have further questions.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > Justine
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > On Wed, Dec 20, 2023 at 3:33 PM Jun Rao <j...@confluent.io.invalid>
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > wrote:
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > Hi, Justine,
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > > It seems that you have made some changes to KIP-890 since the vote. In
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > > particular, we are changing the format of TransactionLogValue. A few
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > > comments related to that.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > > 100. Just to be clear: the overflow case (i.e., when a new producerId is
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > > generated) is when the current epoch equals max - 1 and not max?
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > 101. For the "not epoch overflow" case, we write the previous ID in
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > the tagged field in the complete phase. Do we need to do that, since
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > the producer ID doesn't change in this case?
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > 102. It seems that the meaning of the ProducerId/ProducerEpoch fields
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > in TransactionLogValue changes depending on the TransactionStatus.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > When the TransactionStatus is ongoing, they represent the current
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > ProducerId and the current ProducerEpoch. When the TransactionStatus
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > is PrepareCommit/PrepareAbort, they represent the current ProducerId
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > and the next ProducerEpoch. When the TransactionStatus is
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > Commit/Abort, they further depend on whether the epoch overflows or
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > not. If there is no overflow, they represent the current ProducerId
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > and the next ProducerEpoch (max). Otherwise, they represent the newly
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > generated ProducerId and a ProducerEpoch of 0. Is that right? This
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > does not seem easy to understand. Could we provide some examples like
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > what Artem has done in KIP-939? Have we considered a simpler design
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > where ProducerId/ProducerEpoch always represent the same value (e.g.
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > for the current transaction), independent of the TransactionStatus
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > and epoch overflow?
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > 103. It's not clear to me why we need 3 fields: ProducerId,
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > PrevProducerId, NextProducerId. Could we just have ProducerId and
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > NextProducerId?
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > 104. For WriteTxnMarkerRequests, if the producer epoch overflows,
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > what do we set the producerId and the producerEpoch to?
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > Thanks,
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > > Jun
> > > > > > >>> > > > > > > > >> > > > > > > > > > > > > > > > >
