Justine Olshan created KAFKA-19446:
--------------------------------------
Summary: TV2 late marker can violate EOS guarantees.
Key: KAFKA-19446
URL: https://issues.apache.org/jira/browse/KAFKA-19446
Project: Kafka
Issue Type: Task
Affects Versions: 4.0.0, 4.1.0
Reporter: Justine Olshan
Assignee: Justine Olshan
One case we missed in KIP-890 is a late-arriving WriteTxnMarkersRequest that
reaches a partition for a transaction using TV2.
Because TV2 writes the marker with the producer epoch + 1, we also send the
request with epoch + 1. Due to the somewhat relaxed check on epochs at the log
layer
(https://github.com/apache/kafka/blob/fd70290633191b6f53a9d4ddb24e3a8b619fcd3f/storage/src/main/java/org/apache/kafka/storage/internals/log/ProducerAppendInfo.java#L211),
we can actually accept a late-arriving request for the previous transaction,
since its epoch will be the same as the current producer state epoch.
We should tighten up this check so that it does not allow the same epoch when
using TV2. In other words, the marker epoch should always be >= the current
producer state epoch + 1. (It can be more than + 1 above the state epoch if we
restart the producer and bump the epoch.) We just need a good way to tell
whether a marker is meant for a TV2 transaction.
This + 1 rule works even if we didn't produce records, since the previous
marker will already have updated the epoch.
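
To make the scenario concrete, here is a minimal Java sketch of the two checks.
It is a simplified model, not the actual Kafka code: the class
MarkerEpochCheck, the methods relaxedAccepts and tv2Accepts, and the epoch
value 5 are all hypothetical names and numbers chosen for illustration, and
the sketch assumes we already know the marker belongs to a TV2 transaction
(which is the open question above). Epoch overflow is ignored.

```java
public class MarkerEpochCheck {

    // Simplified model of the relaxed check (assumption: a marker is rejected
    // only when its epoch is strictly smaller than the producer state epoch).
    static boolean relaxedAccepts(short markerEpoch, short stateEpoch) {
        return markerEpoch >= stateEpoch;
    }

    // Hypothetical tightened check for TV2 markers: the marker epoch must be
    // at least stateEpoch + 1, so an equal epoch is rejected.
    static boolean tv2Accepts(short markerEpoch, short stateEpoch) {
        return markerEpoch > stateEpoch;
    }

    public static void main(String[] args) {
        short stateEpoch = 5;                         // txn 1 records written at epoch 5
        short txn1Marker = (short) (stateEpoch + 1);  // TV2 marker carries epoch 6

        // First delivery of the marker: both checks accept it.
        System.out.println("first, relaxed: " + relaxedAccepts(txn1Marker, stateEpoch)); // true
        System.out.println("first, tv2:     " + tv2Accepts(txn1Marker, stateEpoch));     // true

        // Writing the marker updates the producer state epoch to 6.
        stateEpoch = txn1Marker;

        // A late duplicate of the same marker arrives while the next
        // transaction (also at epoch 6) may be open on this partition.
        System.out.println("late, relaxed:  " + relaxedAccepts(txn1Marker, stateEpoch)); // true
        System.out.println("late, tv2:      " + tv2Accepts(txn1Marker, stateEpoch));     // false
    }
}
```

In this model the late duplicate passes the relaxed check because its epoch
equals the state epoch, while the stricter comparison rejects it. A marker
with a larger epoch, for example after a producer restart, still passes both.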