hachikuji commented on a change in pull request #9590:
URL: https://github.com/apache/kafka/pull/9590#discussion_r583179618
##########
File path: core/src/main/scala/kafka/log/LogCleaner.scala
##########

```diff
@@ -701,11 +719,17 @@ private[log] class Cleaner(val id: Int,
       // if any messages are to be retained, write them out
       val outputBuffer = result.outputBuffer
       if (outputBuffer.position() > 0) {
+        if (destSegment.isEmpty) {
+          // create a new segment with a suffix appended to the name of the log and indexes
+          destSegment = Some(LogCleaner.createNewCleanedSegment(log, result.minOffset()))
+          transactionMetadata.cleanedIndex = Some(destSegment.get.txnIndex)
```

Review comment:
   Ok, I think I understand. One thing that makes this a little confusing is the need to reverse the collection. Why don't we just build the index in order?

   I am debating whether this is good enough. A potential problem is that we don't have a guarantee on the size of the index. In the common case it should be small, but there is nothing to prevent an entire segment from being full of aborted transaction markers. Currently we wait until `filterInto` returns before creating the new segment, but maybe instead we could do it in `checkBatchRetention`, after the first retained batch is observed?
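   For what it's worth, here is a rough, self-contained sketch of the idea. The `Batch` and `CleanedSegment` types below are stand-ins for illustration, not the real `RecordBatch`/`LogSegment`/`RecordFilter` API, and the retention decision is stubbed out; the point is only where the segment gets created:

   ```scala
   // Sketch only: stand-in types, not the actual LogCleaner/RecordFilter API.
   object LazyCleanedSegmentSketch {

     // Stand-ins for RecordBatch and the cleaned LogSegment.
     final case class Batch(baseOffset: Long, shouldRetain: Boolean)
     final class CleanedSegment(val baseOffset: Long)

     // Mirrors the mutable destSegment state from the diff above.
     private var destSegment: Option[CleanedSegment] = None

     // Analogue of checkBatchRetention: decide retention per batch, and create
     // the destination segment as soon as the first retained batch is seen,
     // instead of waiting for the whole filter pass to finish.
     def checkBatchRetention(batch: Batch): Boolean = {
       val retain = batch.shouldRetain // real retention logic would go here
       if (retain && destSegment.isEmpty)
         destSegment = Some(new CleanedSegment(batch.baseOffset))
       retain
     }

     def main(args: Array[String]): Unit = {
       val batches = Seq(
         Batch(0, shouldRetain = false),
         Batch(10, shouldRetain = true),
         Batch(20, shouldRetain = true))
       batches.foreach(checkBatchRetention)
       // Segment is keyed off the first retained batch's base offset: Some(10)
       println(destSegment.map(_.baseOffset))
     }
   }
   ```

   I think creating the segment (and hence its `txnIndex`) eagerly like this would also address the first point: aborted-transaction entries could be appended to the index in the order they are encountered, so the reversal would go away.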