[ https://issues.apache.org/jira/browse/KAFKA-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17146757#comment-17146757 ]
Johnny Malizia commented on KAFKA-10207:
----------------------------------------

I didn't see where to assign this to me, but I went ahead and submitted a change to handle this scenario more gracefully: [https://github.com/apache/kafka/pull/8936]. With that said, it does seem to go directly against part of the benefits of KIP-263, so I am open to alternatives.

> Untrimmed Index files cause premature log segment deletions on startup
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-10207
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10207
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 2.4.0, 2.3.1, 2.4.1
>            Reporter: Johnny Malizia
>            Priority: Major
>
> [KIP-263|https://cwiki.apache.org/confluence/display/KAFKA/KIP-263%3A+Allow+broker+to+skip+sanity+check+of+inactive+segments+on+broker+startup#KIP263:Allowbrokertoskipsanitycheckofinactivesegmentsonbrokerstartup-Evaluation] appears to have introduced a change that explicitly skips calling the sanityCheck method on the time and offset index files loaded by Kafka at startup. I found a particularly nasty bug with the following configuration:
> {code:java}
> jvm=1.8.0_191 zfs=0.6.5.6 kernel=4.4.0-1013-aws kafka=2.4.1{code}
> The bug was that the retention period, whether set per topic or at the broker level, was not respected: no matter what, when the broker started up it would decide that every log segment on disk was breaching the retention window and purge the data away.
>
> {code:java}
> Found deletable segments with base offsets [11610665,12130396,12650133] due to retention time 86400000ms breach {code}
> {code:java}
> Rolled new log segment at offset 12764291 in 1 ms. (kafka.log.Log)
> Scheduling segments for deletion List(LogSegment(baseOffset=11610665, size=1073731621, lastModifiedTime=1592532125000, largestTime=0),
> LogSegment(baseOffset=12130396, size=1073727967, lastModifiedTime=1592532462000, largestTime=0),
> LogSegment(baseOffset=12650133, size=235891971, lastModifiedTime=1592532531000, largestTime=0)) {code}
> Further logging showed that the issue occurred while loading the index files, indicating that the final writes to trim the index had not succeeded:
> {code:java}
> DEBUG Loaded index file /mnt/kafka-logs/test_topic-0/00000000000017221277.timeindex with maxEntries = 873813, maxIndexSize = 10485760, entries = 873813, lastOffset = TimestampOffset(0,17221277), file position = 10485756 (kafka.log.TimeIndex){code}
> It looks like the index file is initially preallocated (10 MB by default) and index entries are appended over time. When it is time to roll to a new log segment, the index file is supposed to be trimmed, removing any zero bytes left at the tail from the initial allocation. In some cases, though, that does not seem to happen successfully. Because the zero bytes at the tail were never removed, when the index is loaded again after restarting Kafka the buffer position is set to the end of the file, the last timestamp reads as 0, and this leads to premature TTL deletion of the log segments.
>
> I tracked the issue down to the JVM version in use, as upgrading resolved it, but I think Kafka should never delete data by mistake like this: doing a rolling restart with this bug in place would cause complete data loss across the cluster.
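For anyone trying to reproduce the failure mode outside a broker, the sketch below is a rough, self-contained illustration in plain Java (not Kafka code) of why an untrimmed time index reads back a timestamp of 0. It assumes the 12-byte entry layout implied by the DEBUG line above (873813 entries * 12 bytes = 10485756, the logged file position): an 8-byte timestamp followed by a 4-byte relative offset. The backward scan at the end is just one possible sanity check, not necessarily what the linked PR does.

{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration, not Kafka code: shows why a preallocated,
// never-trimmed time index makes the "last" entry read back as timestamp 0.
public class UntrimmedTimeIndexDemo {
    static final int ENTRY_SIZE = 12;                        // 8-byte timestamp + 4-byte relative offset
    static final int PREALLOCATED_BYTES = 10 * 1024 * 1024;  // 10 MB, the default max index size

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("00000000000017221277", ".timeindex");

        // Preallocate a zero-filled file, then write only three real entries at the
        // front, mimicking an index that was never trimmed when the segment rolled.
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            ByteBuffer zeros = ByteBuffer.allocate(PREALLOCATED_BYTES);
            while (zeros.hasRemaining())
                ch.write(zeros);

            ByteBuffer real = ByteBuffer.allocate(3 * ENTRY_SIZE);
            long now = System.currentTimeMillis();
            for (int i = 0; i < 3; i++) {
                real.putLong(now + i * 1000L);   // timestamp
                real.putInt(i * 100);            // relative offset
            }
            real.flip();
            long pos = 0;
            while (real.hasRemaining())
                pos += ch.write(real, pos);
        }

        // A loader that infers the entry count from the file size lands on a zero entry.
        ByteBuffer buf = ByteBuffer.wrap(Files.readAllBytes(file));
        int entriesBySize = buf.capacity() / ENTRY_SIZE;     // 873813 for a 10 MB file
        long lastTimestamp = buf.getLong((entriesBySize - 1) * ENTRY_SIZE);
        System.out.println("entries inferred from size = " + entriesBySize
                + ", last timestamp = " + lastTimestamp);    // prints 0 -> segment looks infinitely old

        // One possible sanity check (an assumption, not necessarily what the PR does):
        // scan backwards for the last entry with a non-zero timestamp.
        int lastRealEntry = -1;
        for (int i = entriesBySize - 1; i >= 0; i--) {
            if (buf.getLong(i * ENTRY_SIZE) != 0L) {
                lastRealEntry = i;
                break;
            }
        }
        System.out.println("last non-zero entry = " + lastRealEntry
                + " of " + entriesBySize + " slots");        // prints 2

        Files.deleteIfExists(file);
    }
}
{code}

Running it prints a last timestamp of 0 for the size-derived entry count, which matches the largestTime=0 values in the deletion log above.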