FrankYang0529 commented on code in PR #18012:
URL: https://github.com/apache/kafka/pull/18012#discussion_r1907410734
##########
storage/src/main/java/org/apache/kafka/storage/internals/log/LogSegment.java:
##########
@@ -257,13 +257,21 @@ public void append(long largestOffset,
         if (largestTimestampMs > maxTimestampSoFar()) {
             maxTimestampAndOffsetSoFar = new TimestampOffset(largestTimestampMs, shallowOffsetOfMaxTimestamp);
         }
-        // append an entry to the index (if needed)
+        // append an entry to the timestamp index at MemoryRecords level (if needed)
         if (bytesSinceLastIndexEntry > indexIntervalBytes) {
-            offsetIndex().append(largestOffset, physicalPosition);
             timeIndex().maybeAppend(maxTimestampSoFar(), shallowOffsetOfMaxTimestampSoFar());
-            bytesSinceLastIndexEntry = 0;
         }
-        bytesSinceLastIndexEntry += records.sizeInBytes();
+
+        // append an entry to the offset index at batches level (if needed)
+        for (RecordBatch batch : records.batches()) {
+            if (bytesSinceLastIndexEntry > indexIntervalBytes &&
+                    batch.lastOffset() >= offsetIndex().lastOffset()) {
+                offsetIndex().append(batch.lastOffset(), physicalPosition);

Review Comment:
   Hi @junrao, thanks for the review. I addressed both comments. Regarding timestamps: they are not always monotonic across records, so an offset lookup via the timestamp index is not as precise as one via the offset index. We could consider whether it is worth appending a timestamp index entry per batch, but that operation would introduce extra cost.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
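A small standalone sketch of the monotonicity point discussed above (hypothetical classes, not Kafka's actual `LogSegment`/`OffsetIndex` code): batch last offsets only ever grow, so a per-batch offset-index append just needs a `>= lastOffset` guard like the one in the diff, whereas per-batch max timestamps can move backwards and therefore need a running maximum, as `maxTimestampSoFar()` provides.

```java
import java.util.List;

public class IndexMonotonicitySketch {
    // Hypothetical stand-in for a record batch: its last offset and
    // the max timestamp among its records.
    record Batch(long lastOffset, long maxTimestampMs) {}

    public static void main(String[] args) {
        // Offsets are assigned monotonically across batches, but record
        // timestamps may go backwards (e.g. clock skew between producers).
        List<Batch> batches = List.of(
                new Batch(9, 1_000L),
                new Batch(19, 900L),   // timestamp moved backwards
                new Batch(29, 1_200L));

        long lastIndexedOffset = -1;   // mirrors offsetIndex().lastOffset()
        long maxTimestampSoFar = -1;   // mirrors maxTimestampSoFar()

        for (Batch batch : batches) {
            // Offset index: safe to consider an append per batch, guarded
            // the same way the PR guards it.
            if (batch.lastOffset() >= lastIndexedOffset) {
                lastIndexedOffset = batch.lastOffset();
            }
            // Timestamp index: a plain ">= last" guard would skip the
            // 900 ms batch, so a running max is required instead.
            maxTimestampSoFar = Math.max(maxTimestampSoFar, batch.maxTimestampMs());
        }

        System.out.println(lastIndexedOffset + " " + maxTimestampSoFar); // 29 1200
    }
}
```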