Hi Jun, I can update the description.
I believe your second point is mentioned in the KIP. I can add more text on this if it is helpful. > The delayed message case can also violate EOS if the delayed message comes in after the next addPartitionsToTxn request comes in. Effectively we may see a message from a previous (aborted) transaction become part of the next transaction. If the marker is written by the new client, we can as I mentioned in the last email guarantee that any EndTxn requests with the same epoch are from the same producer and the same transaction. Then we don't have to return a fenced error but can handle gracefully as described in the KIP. I don't think a boolean is useful since it is directly encoded by the existence or lack of the tagged field being written. In the prepare marker we will have the same producer ID in the non-tagged field. In the Complete state we may not. I'm not sure why the ongoing state matters for this KIP. It does matter for KIP-939. I'm not sure what you are referring to about writing the previous producer ID in the prepare marker. This is not in the KIP. In the overflow case, we write the nextProducerId in the prepare state. This is so we know what we assigned when we reload the transaction log. Once we complete, we transition this ID to the main (non-tagged field) and have the previous producer ID field filled in. This is so we can identify in a retry case the operation completed successfully and we don't fence our producer. The downgrade compatibility I mention is that we keep the same producer ID and epoch in the main (non-tagged) fields as we did before the code on the server side. If the server downgrades, we are still compatible. This addresses both the prepare and complete state downgrades. Justine On Fri, Jan 12, 2024 at 10:21 AM Jun Rao <j...@confluent.io.invalid> wrote: > Hi, Justine, > > Thanks for the reply. Sorry for the delay. I have a few more comments. > > 110. I think the motivation section could be improved. One of the > motivations listed by the KIP is "This can happen when a message gets stuck > or delayed due to networking issues or a network partition, the transaction > aborts, and then the delayed message finally comes in.". This seems not > very accurate. Without KIP-890, currently, if the coordinator times out and > aborts an ongoing transaction, it already bumps up the epoch in the marker, > which prevents the delayed produce message from being added to the user > partition. What can cause a hanging transaction is that the producer > completes (either aborts or commits) a transaction before receiving a > successful ack on messages published in the same txn. In this case, it's > possible for the delayed message to be appended to the partition after the > marker, causing a transaction to hang. > > A similar issue (not mentioned in the motivation) could happen on the > marker in the coordinator's log. For example, it's possible for an > EndTxnRequest to be delayed on the coordinator. By the time the delayed > EndTxnRequest is processed, it's possible that the previous txn has already > completed and a new txn has started. Currently, since the epoch is not > bumped on every txn, the delayed EndTxnRequest will add an unexpected > prepare marker (and eventually a complete marker) to the ongoing txn. This > won't cause the transaction to hang, but it will break the EoS semantic. > The proposal in this KIP will address this issue too. > > 101. "However, I was writing it so that we can distinguish between > old clients where we don't have the ability do this operation and new > clients that can. (Old clients don't bump the epoch on commit, so we can't > say for sure the write belongs to the given transaction)." > 101.1 I am wondering why we need to distinguish whether the marker is > written by the old and the new client. Could you describe what we do > differently if we know the marker is written by the new client? > 101.2 If we do need a way to distinguish whether the marker is written by > the old and the new client. Would it be simpler to just introduce a boolean > field instead of indirectly through the previous produce ID field? > 101.3 It's not clear to me why we only add the previous produce ID field in > the complete marker, but not in the prepare marker. If we want to know > whether a marker is written by the new client or not, it seems that we want > to do this consistently for all markers. > 101.4 What about the TransactionLogValue record representing the ongoing > state? Should we also distinguish whether it's written by the old or the > new client? > > 102. In the overflow case, it's still not clear to me why we write the > previous produce Id in the prepare marker while writing the next produce Id > in the complete marker. You mentioned that it's for downgrading. However, > we could downgrade with either the prepare marker or the complete marker. > In either case, the downgraded coordinator should see the same produce id > (probably the previous produce Id), right? > > Jun > > On Wed, Dec 20, 2023 at 6:00 PM Justine Olshan > <jols...@confluent.io.invalid> > wrote: > > > Hey Jun, > > > > Thanks for taking a look at the KIP again. > > > > 100. For the epoch overflow case, only the marker will have max epoch. > This > > keeps the behavior of the rest of the markers where the last marker is > the > > epoch of the transaction records + 1. > > > > 101. You are correct that we don't need to write the producer ID since it > > is the same. However, I was writing it so that we can distinguish between > > old clients where we don't have the ability do this operation and new > > clients that can. (Old clients don't bump the epoch on commit, so we > can't > > say for sure the write belongs to the given transaction). If we receive > an > > EndTxn request from a new client, we will fill this field. We can > guarantee > > that any EndTxn requests with the same epoch are from the same producer > and > > the same transaction. > > > > 102. In prepare phase, we have the same producer ID and epoch we always > > had. It is the producer ID and epoch that are on the marker. In commit > > phase, we stay the same unless it is the overflow case. In that case, we > > set the producer ID to the new one we generated and epoch to 0 after > > complete. This is for downgrade compatibility. The tagged fields are just > > safety guards for retries and failovers. > > > > In prepare phase for epoch overflow case only we store the next producer > > ID. This is for the case where we reload the transaction coordinator in > > prepare state. Once the transaction is committed, we can use the producer > > ID the client already is using. > > > > In commit phase, we store the previous producer ID in case of retries. > > > > I think it is easier to think of it as just how we were storing producer > ID > > and epoch before, with some extra bookeeping and edge case handling in > the > > tagged fields. We have to do it this way for compatibility with > downgrades. > > > > 103. Next producer ID is for prepare status and previous producer ID is > for > > after complete. The reason why we need two separate (tagged) fields is > for > > backwards compatibility. We need to keep the same semantics for the > > non-tagged field in case we downgrade. > > > > 104. We set the fields as we do in the transactional state (as we need to > > do this for compatibility -- if we downgrade, we will only have the > > non-tagged fields) It will be the old producer ID and max epoch. > > > > Hope this helps. Let me know if you have further questions. > > > > Justine > > > > On Wed, Dec 20, 2023 at 3:33 PM Jun Rao <j...@confluent.io.invalid> > wrote: > > > > > Hi, Justine, > > > > > > It seems that you have made some changes to KIP-890 since the vote. In > > > particular, we are changing the format of TransactionLogValue. A few > > > comments related to that. > > > > > > 100. Just to be clear. The overflow case (i.e. when a new producerId is > > > generated) is when the current epoch equals to max - 1 and not max? > > > > > > 101. For the "not epoch overflow" case, we write the previous ID in the > > > tagged field in the complete phase. Do we need to do that since produce > > id > > > doesn't change in this case? > > > > > > 102. It seems that the meaning for the ProducerId/ProducerEpoch fields > in > > > TransactionLogValue changes depending on the TransactionStatus. When > > > the TransactionStatus is ongoing, they represent the current ProducerId > > and > > > the current ProducerEpoch. When the TransactionStatus is > > > PrepareCommit/PrepareAbort, they represent the current ProducerId and > the > > > next ProducerEpoch. When the TransactionStatus is Commit/Abort, they > > > further depend on whether the epoch overflows or not. If there is no > > > overflow, they represent the current ProducerId and the next > > ProducerEpoch > > > (max). Otherwise, they represent the newly generated ProducerId and a > > > ProducerEpoch of 0. Is that right? This seems not easy to understand. > > Could > > > we provide some examples like what Artem has done in KIP-939? Have we > > > considered a simpler design where ProducerId/ProducerEpoch always > > represent > > > the same value (e.g. for the current transaction) independent of the > > > TransactionStatus and epoch overflow? > > > > > > 103. It's not clear to me why we need 3 fields: ProducerId, > > PrevProducerId, > > > NextProducerId. Could we just have ProducerId and NextProducerId? > > > > > > 104. For WriteTxnMarkerRequests, if the producer epoch overflows, what > do > > > we set the producerId and the producerEpoch? > > > > > > Thanks, > > > > > > Jun > > > > > > > > > >