dhruvilshah3 opened a new pull request #10388:
URL: https://github.com/apache/kafka/pull/10388


   When we find a `.swap` file on startup, we typically want to rename it to 
`.log`, `.index`, `.timeindex`, etc., replacing the existing file, as a way to 
complete any ongoing replace operation. These swap files are usually known to 
have been flushed to disk before the replace operation begins.
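
   As a rough illustration of the rename step described above, here is a minimal 
sketch that strips the `.swap` suffix and moves the file over its final name. 
The class `SwapRename` and the method `completeSwap` are hypothetical names used 
only for this example; they are not the broker's actual code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, for illustration only (not Kafka's implementation).
final class SwapRename {
    static final String SWAP_SUFFIX = ".swap";

    // Renames e.g. "00000000000000000000.log.swap" to "00000000000000000000.log",
    // replacing any existing file with the target name.
    static Path completeSwap(Path swapFile) throws IOException {
        String name = swapFile.getFileName().toString();
        if (!name.endsWith(SWAP_SUFFIX)) {
            throw new IllegalArgumentException("Not a swap file: " + swapFile);
        }
        String targetName = name.substring(0, name.length() - SWAP_SUFFIX.length());
        Path target = swapFile.resolveSibling(targetName);
        return Files.move(swapFile, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```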
   
   One flaw in the current logic is that recovering these swap files on startup 
truncates the producer state and rebuilds it from scratch. This is unnecessary 
because the replace operation does not mutate the producer state by itself; it 
is only meant to replace the `.log` file along with its corresponding indices. 
Because of this unnecessary producer state rebuild, we have seen multi-hour 
startup times for clusters that have large compacted topics.
   
   This patch fixes the issue by sanity-checking all records in the segment to 
swap and rebuilding the corresponding indices without mutating the producer 
state. Similarly, we also rebuild indices without truncating the producer state 
when we find a missing or corrupted index in the middle of the log.
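
   To make the idea concrete, the sketch below validates every record in a 
segment and rebuilds a sparse offset index in a single pass, touching no 
producer state at all. It assumes a deliberately simplified record layout (an 
8-byte offset, a 4-byte payload length, then the payload), not the real 
on-disk format. `IndexRebuildSketch`, `IndexEntry`, and `rebuildIndex` are 
hypothetical names chosen for this example.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: simplified record layout, not Kafka's real format.
final class IndexRebuildSketch {
    static final int HEADER_BYTES = 12; // 8-byte offset + 4-byte payload length

    static final class IndexEntry {
        final long offset;
        final int position;

        IndexEntry(long offset, int position) {
            this.offset = offset;
            this.position = position;
        }
    }

    // Scans the segment, fails on any malformed record, and returns a sparse
    // index with one entry each time more than indexIntervalBytes have been
    // scanned since the previous entry. No producer state is read or written.
    static List<IndexEntry> rebuildIndex(ByteBuffer segment, int indexIntervalBytes) {
        ByteBuffer buf = segment.duplicate(); // leave the caller's position untouched
        List<IndexEntry> index = new ArrayList<>();
        long lastOffset = -1L;
        int bytesSinceLastEntry = 0;
        while (buf.hasRemaining()) {
            int position = buf.position();
            if (buf.remaining() < HEADER_BYTES) {
                throw new IllegalStateException("Truncated header at position " + position);
            }
            long offset = buf.getLong();
            int length = buf.getInt();
            if (length < 0 || length > buf.remaining()) {
                throw new IllegalStateException("Invalid record length at position " + position);
            }
            if (offset <= lastOffset) {
                throw new IllegalStateException("Non-increasing offset at position " + position);
            }
            if (bytesSinceLastEntry > indexIntervalBytes) {
                index.add(new IndexEntry(offset, position));
                bytesSinceLastEntry = 0;
            }
            buf.position(buf.position() + length); // skip the payload
            bytesSinceLastEntry += HEADER_BYTES + length;
            lastOffset = offset;
        }
        return index;
    }
}
```

   The point of the sketch is the method signature: the index is derived from 
the segment bytes alone, so nothing about producer state needs to be truncated 
or reloaded to produce it.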
   
   The patch also adds an extra sanity check to detect invalid bytes at the end 
of swap segments. Before this patch, we would truncate invalid bytes from the 
swap segment, which could leave us with holes in the log. Because this is an 
unexpected scenario, we now raise an exception in such cases, which fails the 
broker on startup.
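
   The difference between truncating and failing can be sketched as follows. 
The example again assumes a simplified record layout (8-byte offset, 4-byte 
payload length, payload) rather than the real format, and the names 
`SwapSegmentCheck`, `validBytes`, and `ensureNoTrailingInvalidBytes` are 
hypothetical. A real check would likely also verify checksums, which this 
sketch omits.

```java
import java.nio.ByteBuffer;

// Illustrative only: simplified record layout, not Kafka's real format.
final class SwapSegmentCheck {
    static final int HEADER_BYTES = 12; // 8-byte offset + 4-byte payload length

    // Returns the number of leading bytes that form complete, well-formed records.
    static int validBytes(ByteBuffer segment) {
        ByteBuffer buf = segment.duplicate();
        int start = buf.position();
        int valid = 0;
        while (buf.remaining() >= HEADER_BYTES) {
            buf.getLong(); // offset, not needed for this check
            int length = buf.getInt();
            if (length < 0 || length > buf.remaining()) {
                break; // malformed record: stop counting here
            }
            buf.position(buf.position() + length);
            valid = buf.position() - start;
        }
        return valid;
    }

    // Instead of silently truncating to validBytes (the old behavior described
    // above), fail loudly when the segment ends with bytes that are not valid.
    static void ensureNoTrailingInvalidBytes(ByteBuffer segment) {
        int valid = validBytes(segment);
        int total = segment.remaining();
        if (valid != total) {
            throw new IllegalStateException(
                "Swap segment has " + (total - valid) + " invalid trailing bytes");
        }
    }
}
```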

