Hi - we've observed that after a transaction is committed, commit marker
records are appended to every topic partition written to in the
transaction. These commit markers are assigned offsets in the partition,
and KafkaConsumer.endOffsets will reflect a commit marker if it is the last
record appended to the partition, even though the consumer will never
return records with those commit marker offsets.

This can lead to problems. For example, if on startup a system gets the
current end offsets for a topic's partitions and then tracks the offsets a
consumer has consumed, it will never be "caught up" to the end offsets,
since they reflect the commit markers, until newer records are written to
those partitions. A workaround to this specific problem is to assume
lastOffset = endOffset - 2 when the last record is a commit marker
(endOffset is one past the last appended record, and the marker occupies
one more offset), but at best this is surprising until you learn about
commit markers.
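
To make the above concrete, here is a rough sketch of an alternative check
that avoids the "- 2" arithmetic. It assumes (we have not verified this on
every client version) that KafkaConsumer.position advances past a trailing
commit marker once the consumer has fetched it, so comparing the position
to the end offset is enough. CatchUpCheck and isCaughtUp are hypothetical
names, and the maps stand in for values you would read from
KafkaConsumer.position and KafkaConsumer.endOffsets, keyed here by
partition number to keep the sketch free of Kafka dependencies.

```java
import java.util.Map;

public class CatchUpCheck {

    // Hypothetical helper, not part of the Kafka client API.
    // positions:  partition -> value of KafkaConsumer.position(tp),
    //             i.e. the offset of the next record to be fetched.
    // endOffsets: partition -> value from KafkaConsumer.endOffsets(...),
    //             i.e. one past the last appended record, markers included.
    static boolean isCaughtUp(Map<Integer, Long> positions,
                              Map<Integer, Long> endOffsets) {
        for (Map.Entry<Integer, Long> e : endOffsets.entrySet()) {
            // A partition with no known position is treated as offset 0.
            long position = positions.getOrDefault(e.getKey(), 0L);
            if (position < e.getValue()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Partition 0: data records at offsets 0..4, commit marker at 5,
        // so the end offset is 6 and the last consumable offset is 4.
        Map<Integer, Long> endOffsets = Map.of(0, 6L);

        // Position still before the marker: not caught up.
        System.out.println(isCaughtUp(Map.of(0, 5L), endOffsets));

        // Assumed position after the consumer has skipped the marker.
        System.out.println(isCaughtUp(Map.of(0, 6L), endOffsets));
    }
}
```

In this example, comparing the last consumed record offset (4) against
endOffset - 1 (5) would never match, which is the problem described above.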

How have others dealt with this? Are we missing anything important in the
above? Could the docs include more details about commit markers? Should
KafkaConsumer.endOffsets account for commit markers somehow?

Thanks,
Zach
