Hi, Justine, Thanks for the reply.
101.4 "If the marker is written by the new client, we can as I mentioned in the last email guarantee that any EndTxn requests with the same epoch are from the same producer and the same transaction. Then we don't have to return a fenced error but can handle gracefully as described in the KIP." When a delayed EndTnx request is processed, the txn state could be ongoing for the next txn. I guess in this case we still return the fenced error for the delayed request? 102. Sorry, my question was inaccurate. What you described is accurate. "The downgrade compatibility I mention is that we keep the same producer ID and epoch in the main (non-tagged) fields as we did before the code on the server side." If we want to do this, it seems that we should use the current produce Id and max epoch in the existing producerId and producerEpoch fields for both the prepare and the complete marker, right? The downgrade can happen after the complete marker is written. With what you described, the downgraded coordinator will see the new produce Id instead of the old one. Jun On Fri, Jan 12, 2024 at 10:44 AM Justine Olshan <jols...@confluent.io.invalid> wrote: > Hi Jun, > > I can update the description. > > I believe your second point is mentioned in the KIP. I can add more text on > this if it is helpful. > > The delayed message case can also violate EOS if the delayed message > comes in after the next addPartitionsToTxn request comes in. Effectively we > may see a message from a previous (aborted) transaction become part of the > next transaction. > > If the marker is written by the new client, we can as I mentioned in the > last email guarantee that any EndTxn requests with the same epoch are from > the same producer and the same transaction. Then we don't have to return a > fenced error but can handle gracefully as described in the KIP. > I don't think a boolean is useful since it is directly encoded by the > existence or lack of the tagged field being written. > In the prepare marker we will have the same producer ID in the non-tagged > field. In the Complete state we may not. > I'm not sure why the ongoing state matters for this KIP. It does matter for > KIP-939. > > I'm not sure what you are referring to about writing the previous producer > ID in the prepare marker. This is not in the KIP. > In the overflow case, we write the nextProducerId in the prepare state. > This is so we know what we assigned when we reload the transaction log. > Once we complete, we transition this ID to the main (non-tagged field) and > have the previous producer ID field filled in. This is so we can identify > in a retry case the operation completed successfully and we don't fence our > producer. The downgrade compatibility I mention is that we keep the same > producer ID and epoch in the main (non-tagged) fields as we did before the > code on the server side. If the server downgrades, we are still compatible. > This addresses both the prepare and complete state downgrades. > > Justine > > On Fri, Jan 12, 2024 at 10:21 AM Jun Rao <j...@confluent.io.invalid> wrote: > > > Hi, Justine, > > > > Thanks for the reply. Sorry for the delay. I have a few more comments. > > > > 110. I think the motivation section could be improved. One of the > > motivations listed by the KIP is "This can happen when a message gets > stuck > > or delayed due to networking issues or a network partition, the > transaction > > aborts, and then the delayed message finally comes in.". This seems not > > very accurate. Without KIP-890, currently, if the coordinator times out > and > > aborts an ongoing transaction, it already bumps up the epoch in the > marker, > > which prevents the delayed produce message from being added to the user > > partition. What can cause a hanging transaction is that the producer > > completes (either aborts or commits) a transaction before receiving a > > successful ack on messages published in the same txn. In this case, it's > > possible for the delayed message to be appended to the partition after > the > > marker, causing a transaction to hang. > > > > A similar issue (not mentioned in the motivation) could happen on the > > marker in the coordinator's log. For example, it's possible for an > > EndTxnRequest to be delayed on the coordinator. By the time the delayed > > EndTxnRequest is processed, it's possible that the previous txn has > already > > completed and a new txn has started. Currently, since the epoch is not > > bumped on every txn, the delayed EndTxnRequest will add an unexpected > > prepare marker (and eventually a complete marker) to the ongoing txn. > This > > won't cause the transaction to hang, but it will break the EoS semantic. > > The proposal in this KIP will address this issue too. > > > > 101. "However, I was writing it so that we can distinguish between > > old clients where we don't have the ability do this operation and new > > clients that can. (Old clients don't bump the epoch on commit, so we > can't > > say for sure the write belongs to the given transaction)." > > 101.1 I am wondering why we need to distinguish whether the marker is > > written by the old and the new client. Could you describe what we do > > differently if we know the marker is written by the new client? > > 101.2 If we do need a way to distinguish whether the marker is written by > > the old and the new client. Would it be simpler to just introduce a > boolean > > field instead of indirectly through the previous produce ID field? > > 101.3 It's not clear to me why we only add the previous produce ID field > in > > the complete marker, but not in the prepare marker. If we want to know > > whether a marker is written by the new client or not, it seems that we > want > > to do this consistently for all markers. > > 101.4 What about the TransactionLogValue record representing the ongoing > > state? Should we also distinguish whether it's written by the old or the > > new client? > > > > 102. In the overflow case, it's still not clear to me why we write the > > previous produce Id in the prepare marker while writing the next produce > Id > > in the complete marker. You mentioned that it's for downgrading. However, > > we could downgrade with either the prepare marker or the complete marker. > > In either case, the downgraded coordinator should see the same produce id > > (probably the previous produce Id), right? > > > > Jun > > > > On Wed, Dec 20, 2023 at 6:00 PM Justine Olshan > > <jols...@confluent.io.invalid> > > wrote: > > > > > Hey Jun, > > > > > > Thanks for taking a look at the KIP again. > > > > > > 100. For the epoch overflow case, only the marker will have max epoch. > > This > > > keeps the behavior of the rest of the markers where the last marker is > > the > > > epoch of the transaction records + 1. > > > > > > 101. You are correct that we don't need to write the producer ID since > it > > > is the same. However, I was writing it so that we can distinguish > between > > > old clients where we don't have the ability do this operation and new > > > clients that can. (Old clients don't bump the epoch on commit, so we > > can't > > > say for sure the write belongs to the given transaction). If we receive > > an > > > EndTxn request from a new client, we will fill this field. We can > > guarantee > > > that any EndTxn requests with the same epoch are from the same producer > > and > > > the same transaction. > > > > > > 102. In prepare phase, we have the same producer ID and epoch we always > > > had. It is the producer ID and epoch that are on the marker. In commit > > > phase, we stay the same unless it is the overflow case. In that case, > we > > > set the producer ID to the new one we generated and epoch to 0 after > > > complete. This is for downgrade compatibility. The tagged fields are > just > > > safety guards for retries and failovers. > > > > > > In prepare phase for epoch overflow case only we store the next > producer > > > ID. This is for the case where we reload the transaction coordinator in > > > prepare state. Once the transaction is committed, we can use the > producer > > > ID the client already is using. > > > > > > In commit phase, we store the previous producer ID in case of retries. > > > > > > I think it is easier to think of it as just how we were storing > producer > > ID > > > and epoch before, with some extra bookeeping and edge case handling in > > the > > > tagged fields. We have to do it this way for compatibility with > > downgrades. > > > > > > 103. Next producer ID is for prepare status and previous producer ID is > > for > > > after complete. The reason why we need two separate (tagged) fields is > > for > > > backwards compatibility. We need to keep the same semantics for the > > > non-tagged field in case we downgrade. > > > > > > 104. We set the fields as we do in the transactional state (as we need > to > > > do this for compatibility -- if we downgrade, we will only have the > > > non-tagged fields) It will be the old producer ID and max epoch. > > > > > > Hope this helps. Let me know if you have further questions. > > > > > > Justine > > > > > > On Wed, Dec 20, 2023 at 3:33 PM Jun Rao <j...@confluent.io.invalid> > > wrote: > > > > > > > Hi, Justine, > > > > > > > > It seems that you have made some changes to KIP-890 since the vote. > In > > > > particular, we are changing the format of TransactionLogValue. A few > > > > comments related to that. > > > > > > > > 100. Just to be clear. The overflow case (i.e. when a new producerId > is > > > > generated) is when the current epoch equals to max - 1 and not max? > > > > > > > > 101. For the "not epoch overflow" case, we write the previous ID in > the > > > > tagged field in the complete phase. Do we need to do that since > produce > > > id > > > > doesn't change in this case? > > > > > > > > 102. It seems that the meaning for the ProducerId/ProducerEpoch > fields > > in > > > > TransactionLogValue changes depending on the TransactionStatus. When > > > > the TransactionStatus is ongoing, they represent the current > ProducerId > > > and > > > > the current ProducerEpoch. When the TransactionStatus is > > > > PrepareCommit/PrepareAbort, they represent the current ProducerId and > > the > > > > next ProducerEpoch. When the TransactionStatus is Commit/Abort, they > > > > further depend on whether the epoch overflows or not. If there is no > > > > overflow, they represent the current ProducerId and the next > > > ProducerEpoch > > > > (max). Otherwise, they represent the newly generated ProducerId and a > > > > ProducerEpoch of 0. Is that right? This seems not easy to understand. > > > Could > > > > we provide some examples like what Artem has done in KIP-939? Have we > > > > considered a simpler design where ProducerId/ProducerEpoch always > > > represent > > > > the same value (e.g. for the current transaction) independent of the > > > > TransactionStatus and epoch overflow? > > > > > > > > 103. It's not clear to me why we need 3 fields: ProducerId, > > > PrevProducerId, > > > > NextProducerId. Could we just have ProducerId and NextProducerId? > > > > > > > > 104. For WriteTxnMarkerRequests, if the producer epoch overflows, > what > > do > > > > we set the producerId and the producerEpoch? > > > > > > > > Thanks, > > > > > > > > Jun > > > > > > > > > > > > > > > >