hachikuji commented on a change in pull request #9838:
URL: https://github.com/apache/kafka/pull/9838#discussion_r553502329



##########
File path: core/src/main/scala/kafka/log/Log.scala
##########
@@ -2124,12 +2132,14 @@ class Log(@volatile private var _dir: File,
           time = time,
           initFileSize = initFileSize,
           preallocate = config.preallocate))
-        updateLogEndOffset(newOffset)
         leaderEpochCache.foreach(_.clearAndFlush())
 
-        producerStateManager.truncate()
-        producerStateManager.updateMapEndOffset(newOffset)
+        // Clear producer state prior to adjusting log start and end offsets
+        // in case the first unstable offset is invalidated by the truncation.
+        producerStateManager.truncateFullyAndStartAt(newOffset)
         maybeIncrementFirstUnstableOffset()
+
+        updateLogEndOffset(newOffset)
         updateLogStartOffset(newOffset)

Review comment:
       One note here: in the initial patch, I reordered `updateLogStartOffset` 
before `updateLogEndOffset`. This caused the  test case 
`ReassignPartitionsIntegrationTest.testHighWaterMarkAfterPartitionReassignment` 
to fail. The problem is the logic in `updateHighWatermark`, which is called in 
`updateLogStartOffset`, assumes that the log end offset has already been set 
correctly. The result of reversing the order was that the high watermark was 
left at a value lower than the log start offset. 
   
   Frankly the test case is a bit contrived. It calls `truncateFullyAndStartAt` 
on the leader before starting a reassignment. I don't think there are any valid 
operations that could have the same effect. At the same time, it feels brittle 
to rely on tricky ordering assumptions like this, so I'm wondering whether we 
can improve this. One idea for the case of `truncateFullyAndStartAt` is to 
bypass `updateLogEndOffset` and `updateLogStartOffset` and update the values 
directly. Any thoughts?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to