For what it's worth, the sequence number is not calculated as "baseOffset/baseSequence + offset delta"; it increases monotonically within a given epoch. If the epoch is bumped, we reset it back to zero. This means the offset and the sequence may happen to match, but they do not need to be the same. The sequence number also always comes from the client and is carried in the produce records sent to the Kafka broker.
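To make the distinction concrete, here is a minimal Python sketch. It is a toy model, not Kafka's actual code: `ToyProducer`, `ToyLog`, and `demo` are made-up names, and it assumes a single producer writing to a single partition with one record per request. Producer IDs, batching, and duplicate detection are left out.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyProducer:
    """Hypothetical client-side state: the client owns the sequence number."""
    epoch: int = 0
    next_sequence: int = 0

    def bump_epoch(self) -> None:
        # A new epoch restarts the sequence at zero.
        self.epoch += 1
        self.next_sequence = 0

    def next_record(self) -> Tuple[int, int]:
        # The sequence grows by one per record within the current epoch.
        record = (self.epoch, self.next_sequence)
        self.next_sequence += 1
        return record


@dataclass
class ToyLog:
    """Hypothetical leader-side log: the offset comes from the log end offset."""
    log_end_offset: int = 0
    records: List[Tuple[int, int, int]] = field(default_factory=list)

    def append(self, record: Tuple[int, int]) -> Tuple[int, int, int]:
        epoch, sequence = record
        # The log picks the offset; the client never chooses it.
        stored = (self.log_end_offset, epoch, sequence)
        self.records.append(stored)
        self.log_end_offset += 1
        return stored


def demo() -> List[Tuple[int, int, int]]:
    """Append three records, bump the epoch, then append two more."""
    producer = ToyProducer()
    log = ToyLog()
    for _ in range(3):
        log.append(producer.next_record())
    producer.bump_epoch()
    for _ in range(2):
        log.append(producer.next_record())
    return log.records  # entries are (offset, epoch, sequence)
```

In this run the first three records happen to have matching offset and sequence (0, 1, 2). After the epoch bump the offsets continue at 3 and 4, while the sequence restarts at 0 and 1.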
As for offsets, there is some code in the log layer that maintains the log end offset and assigns offsets to the records. The produce handling on the leader typically assigns the offset. I believe you can find that code here:
https://github.com/apache/kafka/blob/b9a45546a7918799b6fb3c0fe63b56f47d8fcba9/core/src/main/scala/kafka/log/UnifiedLog.scala#L766

Justine

On Tue, Aug 1, 2023 at 11:38 AM Matthias J. Sax <mj...@apache.org> wrote:

> The _offset_ is the position of the record in the partition.
>
> The _sequence number_ is a unique ID that allows the broker to de-duplicate
> messages. It requires the producer to implement the idempotency protocol
> (part of Kafka transactions); thus, sequence numbers are optional and as
> long as you don't want to support idempotent writes, you don't need to
> worry about them. (If you want to dig into details, check out KIP-98, which
> is the original KIP about Kafka TX.)
>
> HTH,
> -Matthias
>
> On 8/1/23 2:19 AM, tison wrote:
> > Hi,
> >
> > I'm writing a Kafka API Rust codec library[1] to understand how Kafka
> > models its concepts and how the core business logic works.
> >
> > While implementing the codec for Records[2], I saw a pair of fields,
> > "sequence" and "offset". Both of them are calculated by
> > baseOffset/baseSequence + offset delta. I'm a bit confused about how to
> > deal with them properly - what's the logical difference between these two
> > concepts?
> >
> > Also, to understand how the core business logic works, I wrote a simple
> > server based on my codec library, and observed that the server may need to
> > update the offset for produced records. How does Kafka set the correct
> > offset for each produced record? And how does Kafka maintain the
> > calculation for offset and sequence during these modifications?
> >
> > I'd appreciate it if anyone can answer the question or give some insights
> > :D
> >
> > Best,
> > tison.
> >
> > [1] https://github.com/tisonkun/kafka-api
> > [2] https://kafka.apache.org/documentation/#messageformat