Hi Jun,

101.3 I can change "last seen" to "current producer id and epoch" if that was the part that was confusing
110 I can mention this
111 I can do that
112 We still need it. But I am still finalizing the design. I will update the KIP once I get the information finalized. Sorry for the delays.
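To make the 101.3 wording concrete, here is a rough sketch of the check I have in mind for a retried EndTxn against a completed transaction. This is only an illustration -- the class and parameter names are placeholders I made up, not the actual coordinator code:

final class EndTxnRetryCheck {
    // A retry of the transaction we just completed arrives with the
    // producer id we recorded in the tagged field, an epoch exactly one
    // below the current epoch, and the same operation (commit vs. abort).
    static boolean isRetryOfCompletedTxn(long requestProducerId,
                                         short requestEpoch,
                                         boolean requestIsCommit,
                                         long taggedProducerId,
                                         short currentEpoch,
                                         boolean completedAsCommit) {
        return requestProducerId == taggedProducerId
            && requestEpoch == currentEpoch - 1
            && requestIsCommit == completedAsCommit;
    }
}

If this check passes, we can return success with the new epoch rather than a fenced error; in the prepare state we return CONCURRENT_TRANSACTIONS instead, as discussed below.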
Justine

On Fri, Jan 19, 2024 at 10:50 AM Jun Rao <j...@confluent.io.invalid> wrote:

> Hi, Justine,
>
> Thanks for the reply.
>
> 101.3 In the non-overflow case, the previous ID is the same as the producer ID for the complete marker too, but we set the previous ID in the complete marker. Earlier you mentioned that this is to know that the marker is written by the new client so that we could return success on retried endMarker requests. I was trying to understand why this is not needed for the prepare marker since a retry can happen in the prepare state too. Is the reason that in the prepare state, we return CONCURRENT_TRANSACTIONS instead of success on retried endMarker requests? If so, should we change "If we retry and see epoch - 1 + ID in last seen fields and are issuing the same command (ie commit not abort) we can return (with the new epoch)" accordingly?
>
> 110. Yes, without this KIP, a delayed endMarker request carries the same epoch and won't be fenced. This can commit/abort a future transaction unexpectedly. I am not sure if we have seen this in practice though.
>
> 111. Sounds good. It would be useful to make it clear that we can now populate the lastSeen field from the log reliably.
>
> 112. Yes, I was referring to AddPartitionsToTxnRequest since it's called across brokers and we are changing its schema. Are you saying we don't need it any more? I thought that we already implemented the server side verification logic based on AddPartitionsToTxnRequest across brokers.
>
> Jun
>
> On Thu, Jan 18, 2024 at 5:05 PM Justine Olshan <jols...@confluent.io.invalid> wrote:
>
> > Hey Jun,
> >
> > 101.3 We don't set the previous ID in the Prepare field since we don't need it. It is the same producer ID as the main producer ID field.
> >
> > 110 Hmm -- maybe I need to reread your message about delayed markers. If we receive a delayed endTxn marker after the transaction is already complete? So we will commit the next transaction early without the fixes in part 2?
> >
> > 111 Yes -- this terminology was used in a previous KIP and was never implemented in the log -- only in memory
> >
> > 112 Hmm -- which interbroker protocol are you referring to? I am working on the design for the work to remove the extra add partitions call, and right now the design bumps MV. I have yet to update that section as I finalize the design, so please stay tuned. Was there anything else you thought needed an MV bump?
> >
> > Justine
> >
> > On Thu, Jan 18, 2024 at 3:07 PM Jun Rao <j...@confluent.io.invalid> wrote:
> >
> > > Hi, Justine,
> > >
> > > I don't see this creating any issue. It just makes it a bit hard to explain what this non-tagged produce id field means. We are essentially trying to combine two actions (completing a txn and initializing a new producer Id) in a single record. But, this may be fine too.
> > >
> > > A few other follow up comments.
> > >
> > > 101.3 I guess the reason that we only set the previous producer id tagged field in the complete marker, but not in the prepare marker, is that in the prepare state, we always return CONCURRENT_TRANSACTIONS on retried endMarker requests?
> > >
> > > 110. "I believe your second point is mentioned in the KIP. I can add more text on this if it is helpful.
> > > > The delayed message case can also violate EOS if the delayed message comes in after the next addPartitionsToTxn request comes in.
> > > > Effectively we may see a message from a previous (aborted) transaction become part of the next transaction."
> > >
> > > The above is the case when a delayed message is appended to the data partition. What I mentioned is a slightly different case, when a delayed marker is appended to the transaction log partition.
> > >
> > > 111. The KIP says "Once we move past the Prepare and Complete states, we don’t need to worry about lastSeen fields and clear them, just handle state transitions as normal.". Is the lastSeen field the same as the previous Producer Id tagged field in TransactionLogValue?
> > >
> > > 112. Since the KIP changes the inter-broker protocol, should we bump up the MV/IBP version? Is this feature only for the KRaft mode?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Wed, Jan 17, 2024 at 11:13 AM Justine Olshan <jols...@confluent.io.invalid> wrote:
> > >
> > > > Hey Jun,
> > > >
> > > > I'm glad we are getting to convergence on the design. :)
> > > >
> > > > While I understand it seems a little "weird", I'm not sure what the benefit of writing an extra record to the log would be. Is the concern that a tool to describe transactions won't work (ie, the complete state is needed to calculate the time since the transaction completed)? If we have a reason like this, it is enough to convince me we need such an extra record. It seems like it would be replacing the record written on InitProducerId. Is this correct?
> > > >
> > > > Thanks,
> > > > Justine
> > > >
> > > > On Tue, Jan 16, 2024 at 5:14 PM Jun Rao <j...@confluent.io.invalid> wrote:
> > > >
> > > > > Hi, Justine,
> > > > >
> > > > > Thanks for the explanation. I understand the intention now. In the overflow case, we set the non-tagged field to the old pid (and the max epoch) in the prepare marker so that we could correctly write the marker to the data partition if the broker downgrades. When writing the complete marker, we know the marker has already been written to the data partition. We set the non-tagged field to the new pid to avoid InvalidPidMappingException in the client if the broker downgrades.
> > > > >
> > > > > The above seems to work. It's just a bit inconsistent for a prepare marker and a complete marker to use different pids in this special case. If we downgrade with the complete marker, it seems that we will never be able to write the complete marker with the old pid. Not sure if it causes any issue, but it seems a bit weird. Instead of writing the complete marker with the new pid, could we write two records: a complete marker with the old pid followed by a TransactionLogValue with the new pid and an empty state? We could make the two records in the same batch so that they will be added to the log atomically.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Fri, Jan 12, 2024 at 5:40 PM Justine Olshan <jols...@confluent.io.invalid> wrote:
> > > > >
> > > > > > (1) the prepare marker is written, but the endTxn response is not received by the client when the server downgrades
> > > > > > (2) the prepare marker is written, the endTxn response is received by the client when the server downgrades.
> > > > > >
> > > > > > I think I am still a little confused. In both of these cases, the transaction log has the old producer ID. We don't write the new producer ID in the prepare marker's non-tagged fields. If the server downgrades now, it would read the records not in tagged fields and the complete marker will also have the old producer ID. (If we had used the new producer ID, we would not have transactional correctness since the producer id doesn't match the transaction and the state would not be correct on the data partition.)
> > > > > >
> > > > > > In the overflow case, I'd expect the following to happen on the client side:
> > > > > > Case 1 -- we retry EndTxn -- it is the same producer ID and epoch - 1. This would fence the producer.
> > > > > > Case 2 -- we don't retry EndTxn and use the new producer id, which would result in InvalidPidMappingException.
> > > > > >
> > > > > > Maybe we can have special handling for when a server downgrades. When it reconnects we could get an API version request showing KIP-890 part 2 is not supported. In that case, we can call initProducerId to abort the transaction. (In the overflow case, this correctly gives us a new producer ID.)
> > > > > >
> > > > > > I guess the corresponding case would be where the *complete marker* is written but the endTxn is not received by the client and the server downgrades? This would result in the transaction coordinator having the new ID and not the old one. If the client retries, it will receive an InvalidPidMappingException. The InitProducerId scenario above would help here too.
> > > > > >
> > > > > > To be clear, my compatibility story is meant to support downgrades server side in keeping the transactional correctness. Keeping the client from fencing itself is not the priority.
> > > > > >
> > > > > > Hope this helps. I can also add text in the KIP about InitProducerId if we think that fixes some edge cases.
> > > > > >
> > > > > > Justine
> > > > > >
> > > > > > On Fri, Jan 12, 2024 at 4:10 PM Jun Rao <j...@confluent.io.invalid> wrote:
> > > > > >
> > > > > > > Hi, Justine,
> > > > > > >
> > > > > > > Thanks for the reply.
> > > > > > >
> > > > > > > I agree that we don't need to optimize for fencing during downgrades.
> > > > > > > Regarding consistency, there are two possible cases: (1) the prepare marker is written, but the endTxn response is not received by the client when the server downgrades; (2) the prepare marker is written, and the endTxn response is received by the client when the server downgrades. In (1), the client will have the old producer Id and in (2), the client will have the new producer Id. If we downgrade right after the prepare marker, we can't be consistent with both (1) and (2) since we can only put one value in the existing producer Id field. It's also not clear which case is more likely. So we could probably be consistent with either case. By putting the new producer Id in the prepare marker, we are consistent with case (2) and it also has the slight benefit that the producer field in the prepare and complete markers is consistent in the overflow case.
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > > > On Fri, Jan 12, 2024 at 3:11 PM Justine Olshan <jols...@confluent.io.invalid> wrote:
> > > > > > >
> > > > > > > > Hi Jun,
> > > > > > > >
> > > > > > > > In the case you describe, we would need to have a delayed request, send a successful EndTxn, and a successful AddPartitionsToTxn, and then have the delayed EndTxn request go through for a given producer. I'm trying to figure out if it is possible for the client to transition if a previous request is delayed somewhere. But yes, in this case I think we would fence the client.
> > > > > > > >
> > > > > > > > Not for the overflow case. In the overflow case, the producer ID and the epoch are different on the marker and on the new transaction. So we want the marker to use the max epoch, but the new transaction should start with the new ID and epoch 0 in the transactional state.
> > > > > > > >
> > > > > > > > In the server downgrade case, we want to see the producer ID as that is what the client will have. If we complete the commit, and the transaction state is reloaded, we need the new producer ID in the state so there isn't an invalid producer ID mapping. The server downgrade cases are considering transactional correctness and not regressing from previous behavior -- and are not concerned about supporting the safety from fencing retries (as we have downgraded, so we don't need to support it). Perhaps this is a trade off, but I think it is the right one.
> > > > > > > >
> > > > > > > > (If the client downgrades, it will have restarted and it is ok for it to have a new producer ID too.)
> > > > > > > >
> > > > > > > > Justine
> > > > > > > >
> > > > > > > > On Fri, Jan 12, 2024 at 11:42 AM Jun Rao <j...@confluent.io.invalid> wrote:
> > > > > > > >
> > > > > > > > > Hi, Justine,
> > > > > > > > >
> > > > > > > > > Thanks for the reply.
> > > > > > > > >
> > > > > > > > > 101.4 "If the marker is written by the new client, we can as I mentioned in the last email guarantee that any EndTxn requests with the same epoch are from the same producer and the same transaction. Then we don't have to return a fenced error but can handle gracefully as described in the KIP." When a delayed EndTxn request is processed, the txn state could be ongoing for the next txn. I guess in this case we still return the fenced error for the delayed request?
> > > > > > > > >
> > > > > > > > > 102. Sorry, my question was inaccurate. What you described is accurate. "The downgrade compatibility I mention is that we keep the same producer ID and epoch in the main (non-tagged) fields as we did before the code on the server side." If we want to do this, it seems that we should use the current producer Id and max epoch in the existing producerId and producerEpoch fields for both the prepare and the complete marker, right? The downgrade can happen after the complete marker is written. With what you described, the downgraded coordinator will see the new producer Id instead of the old one.
> > > > > > > > >
> > > > > > > > > Jun
> > > > > > > > >
> > > > > > > > > On Fri, Jan 12, 2024 at 10:44 AM Justine Olshan <jols...@confluent.io.invalid> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Jun,
> > > > > > > > > >
> > > > > > > > > > I can update the description.
> > > > > > > > > >
> > > > > > > > > > I believe your second point is mentioned in the KIP. I can add more text on this if it is helpful.
> > > > > > > > > > > The delayed message case can also violate EOS if the delayed message comes in after the next addPartitionsToTxn request comes in. Effectively we may see a message from a previous (aborted) transaction become part of the next transaction.
> > > > > > > > > >
> > > > > > > > > > If the marker is written by the new client, we can as I mentioned in the last email guarantee that any EndTxn requests with the same epoch are from the same producer and the same transaction. Then we don't have to return a fenced error but can handle gracefully as described in the KIP.
> > > > > > > > > >
> > > > > > > > > > I don't think a boolean is useful since it is directly encoded by the existence or lack of the tagged field being written. In the prepare marker we will have the same producer ID in the non-tagged field. In the Complete state we may not. I'm not sure why the ongoing state matters for this KIP. It does matter for KIP-939.
> > > > > > > > > >
> > > > > > > > > > I'm not sure what you are referring to about writing the previous producer ID in the prepare marker. This is not in the KIP. In the overflow case, we write the nextProducerId in the prepare state. This is so we know what we assigned when we reload the transaction log. Once we complete, we transition this ID to the main (non-tagged) field and have the previous producer ID field filled in. This is so we can identify in a retry case that the operation completed successfully and we don't fence our producer. The downgrade compatibility I mention is that we keep the same producer ID and epoch in the main (non-tagged) fields as we did before the code on the server side. If the server downgrades, we are still compatible. This addresses both the prepare and complete state downgrades.
> > > > > > > > > >
> > > > > > > > > > Justine
> > > > > > > > > >
> > > > > > > > > > On Fri, Jan 12, 2024 at 10:21 AM Jun Rao <j...@confluent.io.invalid> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi, Justine,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for the reply. Sorry for the delay. I have a few more comments.
> > > > > > > > > > >
> > > > > > > > > > > 110. I think the motivation section could be improved. One of the motivations listed by the KIP is "This can happen when a message gets stuck or delayed due to networking issues or a network partition, the transaction aborts, and then the delayed message finally comes in.". This seems not very accurate. Without KIP-890, currently, if the coordinator times out and aborts an ongoing transaction, it already bumps up the epoch in the marker, which prevents the delayed produce message from being added to the user partition. What can cause a hanging transaction is that the producer completes (either aborts or commits) a transaction before receiving a successful ack on messages published in the same txn.
> > > > > > > > > > > In this case, it's possible for the delayed message to be appended to the partition after the marker, causing a transaction to hang.
> > > > > > > > > > >
> > > > > > > > > > > A similar issue (not mentioned in the motivation) could happen on the marker in the coordinator's log. For example, it's possible for an EndTxnRequest to be delayed on the coordinator. By the time the delayed EndTxnRequest is processed, it's possible that the previous txn has already completed and a new txn has started. Currently, since the epoch is not bumped on every txn, the delayed EndTxnRequest will add an unexpected prepare marker (and eventually a complete marker) to the ongoing txn. This won't cause the transaction to hang, but it will break the EoS semantic. The proposal in this KIP will address this issue too.
> > > > > > > > > > >
> > > > > > > > > > > 101. "However, I was writing it so that we can distinguish between old clients where we don't have the ability to do this operation and new clients that can. (Old clients don't bump the epoch on commit, so we can't say for sure the write belongs to the given transaction)."
> > > > > > > > > > > 101.1 I am wondering why we need to distinguish whether the marker is written by the old or the new client. Could you describe what we do differently if we know the marker is written by the new client?
> > > > > > > > > > > 101.2 If we do need a way to distinguish whether the marker is written by the old or the new client, would it be simpler to just introduce a boolean field instead of doing it indirectly through the previous producer ID field?
> > > > > > > > > > > 101.3 It's not clear to me why we only add the previous producer ID field in the complete marker, but not in the prepare marker. If we want to know whether a marker is written by the new client or not, it seems that we want to do this consistently for all markers.
> > > > > > > > > > > 101.4 What about the TransactionLogValue record representing the ongoing state? Should we also distinguish whether it's written by the old or the new client?
> > > > > > > > > > >
> > > > > > > > > > > 102.
> > > > > > > > > > > In the overflow case, it's still not clear to me why we write the previous producer Id in the prepare marker while writing the next producer Id in the complete marker. You mentioned that it's for downgrading. However, we could downgrade with either the prepare marker or the complete marker. In either case, the downgraded coordinator should see the same producer id (probably the previous producer Id), right?
> > > > > > > > > > >
> > > > > > > > > > > Jun
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Dec 20, 2023 at 6:00 PM Justine Olshan <jols...@confluent.io.invalid> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hey Jun,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for taking a look at the KIP again.
> > > > > > > > > > > >
> > > > > > > > > > > > 100. For the epoch overflow case, only the marker will have max epoch. This keeps the behavior of the rest of the markers, where the last marker is the epoch of the transaction records + 1.
> > > > > > > > > > > >
> > > > > > > > > > > > 101. You are correct that we don't need to write the producer ID since it is the same. However, I was writing it so that we can distinguish between old clients where we don't have the ability to do this operation and new clients that can. (Old clients don't bump the epoch on commit, so we can't say for sure the write belongs to the given transaction.) If we receive an EndTxn request from a new client, we will fill this field. We can guarantee that any EndTxn requests with the same epoch are from the same producer and the same transaction.
> > > > > > > > > > > >
> > > > > > > > > > > > 102. In the prepare phase, we have the same producer ID and epoch we always had. It is the producer ID and epoch that are on the marker. In the commit phase, we stay the same unless it is the overflow case. In that case, we set the producer ID to the new one we generated and the epoch to 0 after complete. This is for downgrade compatibility. The tagged fields are just safety guards for retries and failovers.
> > > > > > > > > > > >
> > > > > > > > > > > > In the prepare phase, for the epoch overflow case only, we store the next producer ID. This is for the case where we reload the transaction coordinator in the prepare state. Once the transaction is committed, we can use the producer ID the client is already using.
> > > > > > > > > > > >
> > > > > > > > > > > > In the commit phase, we store the previous producer ID in case of retries.
> > > > > > > > > > > >
> > > > > > > > > > > > I think it is easier to think of it as just how we were storing producer ID and epoch before, with some extra bookkeeping and edge case handling in the tagged fields. We have to do it this way for compatibility with downgrades.
> > > > > > > > > > > >
> > > > > > > > > > > > 103. Next producer ID is for the prepare status and previous producer ID is for after complete. The reason why we need two separate (tagged) fields is for backwards compatibility. We need to keep the same semantics for the non-tagged field in case we downgrade.
> > > > > > > > > > > >
> > > > > > > > > > > > 104. We set the fields as we do in the transactional state (as we need to do this for compatibility -- if we downgrade, we will only have the non-tagged fields). It will be the old producer ID and max epoch.
> > > > > > > > > > > >
> > > > > > > > > > > > Hope this helps. Let me know if you have further questions.
> > > > > > > > > > > >
> > > > > > > > > > > > Justine
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Dec 20, 2023 at 3:33 PM Jun Rao <j...@confluent.io.invalid> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi, Justine,
> > > > > > > > > > > > >
> > > > > > > > > > > > > It seems that you have made some changes to KIP-890 since the vote. In particular, we are changing the format of TransactionLogValue. A few comments related to that.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 100. Just to be clear. The overflow case (i.e. when a new producerId is generated) is when the current epoch equals max - 1 and not max?
> > > > > > > > > > > > >
> > > > > > > > > > > > > 101. For the "not epoch overflow" case, we write the previous ID in the tagged field in the complete phase. Do we need to do
Do we need to > do > > > that > > > > > > since > > > > > > > > > > produce > > > > > > > > > > > > id > > > > > > > > > > > > > doesn't change in this case? > > > > > > > > > > > > > > > > > > > > > > > > > > 102. It seems that the meaning for the > > > > > > ProducerId/ProducerEpoch > > > > > > > > > > fields > > > > > > > > > > > in > > > > > > > > > > > > > TransactionLogValue changes depending on the > > > > > > TransactionStatus. > > > > > > > > > When > > > > > > > > > > > > > the TransactionStatus is ongoing, they represent > the > > > > > current > > > > > > > > > > ProducerId > > > > > > > > > > > > and > > > > > > > > > > > > > the current ProducerEpoch. When the > TransactionStatus > > > is > > > > > > > > > > > > > PrepareCommit/PrepareAbort, they represent the > > current > > > > > > > ProducerId > > > > > > > > > and > > > > > > > > > > > the > > > > > > > > > > > > > next ProducerEpoch. When the TransactionStatus is > > > > > > Commit/Abort, > > > > > > > > > they > > > > > > > > > > > > > further depend on whether the epoch overflows or > not. > > > If > > > > > > there > > > > > > > is > > > > > > > > > no > > > > > > > > > > > > > overflow, they represent the current ProducerId > and > > > the > > > > > next > > > > > > > > > > > > ProducerEpoch > > > > > > > > > > > > > (max). Otherwise, they represent the newly > generated > > > > > > ProducerId > > > > > > > > > and a > > > > > > > > > > > > > ProducerEpoch of 0. Is that right? This seems not > > easy > > > to > > > > > > > > > understand. > > > > > > > > > > > > Could > > > > > > > > > > > > > we provide some examples like what Artem has done > in > > > > > KIP-939? > > > > > > > > Have > > > > > > > > > we > > > > > > > > > > > > > considered a simpler design where > > > > ProducerId/ProducerEpoch > > > > > > > always > > > > > > > > > > > > represent > > > > > > > > > > > > > the same value (e.g. for the current transaction) > > > > > independent > > > > > > > of > > > > > > > > > the > > > > > > > > > > > > > TransactionStatus and epoch overflow? > > > > > > > > > > > > > > > > > > > > > > > > > > 103. It's not clear to me why we need 3 fields: > > > > ProducerId, > > > > > > > > > > > > PrevProducerId, > > > > > > > > > > > > > NextProducerId. Could we just have ProducerId and > > > > > > > NextProducerId? > > > > > > > > > > > > > > > > > > > > > > > > > > 104. For WriteTxnMarkerRequests, if the producer > > epoch > > > > > > > overflows, > > > > > > > > > > what > > > > > > > > > > > do > > > > > > > > > > > > > we set the producerId and the producerEpoch? > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > > Jun > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >