hachikuji commented on a change in pull request #9590:
URL: https://github.com/apache/kafka/pull/9590#discussion_r544593141
##########
File path:
clients/src/main/java/org/apache/kafka/common/record/MemoryRecords.java
##########
@@ -198,7 +198,7 @@ private static FilterResult filterTo(TopicPartition partition, Iterable<MutableR
 if (!retainedRecords.isEmpty()) {
 if (writeOriginalBatch) {
 batch.writeTo(bufferOutputStream);
- filterResult.updateRetainedBatchMetadata(batch, retainedRecords.size(), false);
+ filterResult.updateRetainedBatchMetadata(batch, retainedRecords.get(0).offset(), retainedRecords.size(), false);
Review comment:
I think we're on the right track. We don't have to implement exactly what the
Jira describes. One thing to keep in mind is that the first record in the log
might be a control record, which a consumer won't return to applications
anyway. Also note that the log start offset has implications for replication:
only the records above the log start offset will be replicated to a new
replica, but all of the retained records after cleaning should be replicated.
Hence we already have some constraints which make it difficult to keep the log
start offset aligned with the first returnable record.

I would suggest that we simply define and document a reasonable invariant. It
should be clear that we need to ensure the segment base offset is less than or
equal to the base offset of the first batch contained in it. The invariant I
would suggest is that the log start offset should be less than or equal to the
end offset of the first batch that can be returned from `Fetch`. For the case
of log cleaning, the simplest thing to do is to keep the log start offset
aligned with the base offset of the first batch. In the case of
`DeleteRecords`, the log start offset will point to some offset within the
first batch.
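
To make the two conditions above concrete, here is a minimal sketch in Java.
Everything in it is hypothetical and for illustration only: the class
`LogOffsetInvariants`, the `Batch` stand-in, and the method names are not part
of Kafka. It also assumes that "end offset" means the batch's last offset,
inclusive.

```java
// Hypothetical illustration only; none of these names exist in Kafka.
final class LogOffsetInvariants {

    /** Minimal stand-in for a record batch, with inclusive base and last offsets. */
    static final class Batch {
        final long baseOffset;
        final long lastOffset; // assumption: "end offset" is the last offset, inclusive

        Batch(long baseOffset, long lastOffset) {
            if (lastOffset < baseOffset) {
                throw new IllegalArgumentException("lastOffset must be >= baseOffset");
            }
            this.baseOffset = baseOffset;
            this.lastOffset = lastOffset;
        }
    }

    /** The segment base offset must be <= the base offset of the first batch in the segment. */
    static boolean segmentBaseOffsetValid(long segmentBaseOffset, Batch firstBatch) {
        return segmentBaseOffset <= firstBatch.baseOffset;
    }

    /** The log start offset must be <= the end offset of the first batch returnable from Fetch. */
    static boolean logStartOffsetValid(long logStartOffset, Batch firstFetchableBatch) {
        return logStartOffset <= firstFetchableBatch.lastOffset;
    }

    /** Log cleaning case: align the log start offset with the base offset of the first batch. */
    static long logStartOffsetAfterCleaning(Batch firstRetainedBatch) {
        return firstRetainedBatch.baseOffset;
    }

    private LogOffsetInvariants() { }
}
```

With a first batch covering offsets 10 through 14, a log start offset of 10
(after cleaning) or 12 (after a `DeleteRecords` call that lands inside the
batch) satisfies the check, while 15 does not.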