If you hard kill the broker, then when it restarts it doesn't know
the status of its on-disk files, so it has to run through the last
log segment to validate the checksums of the messages and rebuild
the index from them to ensure consistency. (Why does it need to do
this validation? Because in the event of a server crash, filesystems
give very few general guarantees for unflushed data, so they can
legally leave invalid bytes you never wrote in unflushed portions of
the log or index (shocking but true). In your case it wasn't a
server crash, just a process kill, but there is no way to
differentiate, so we have to pessimistically do this check on any
restart after an unclean shutdown.) This message doesn't indicate
anything bad other than that an unclean shutdown occurred (which is
why it's a WARN). I do think that message is perhaps a bit too
alarming, though.
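
To make that concrete, here is a rough sketch of what that recovery
pass amounts to. This is just illustrative Scala, not the actual
kafka.log recovery code; Record, recover, and the index interval
parameter are made-up names:

    import java.util.zip.CRC32

    // Illustrative only: a much-simplified model of segment recovery.
    final case class Record(offset: Long, payload: Array[Byte], crc: Long)

    object SegmentRecoverySketch {

      private def computeCrc(payload: Array[Byte]): Long = {
        val c = new CRC32()
        c.update(payload)
        c.getValue
      }

      // Scan the last segment's records in order, verifying each checksum.
      // Everything from the first corrupt record onward is discarded, and
      // sparse index entries (relative offset from baseOffset, byte position)
      // are rebuilt along the way.
      def recover(records: Seq[Record], baseOffset: Long, indexInterval: Int = 2)
          : (Seq[Record], Seq[(Long, Int)]) = {
        val valid = records.takeWhile(r => computeCrc(r.payload) == r.crc)
        var bytePos = 0
        val index = Seq.newBuilder[(Long, Int)]
        valid.zipWithIndex.foreach { case (r, i) =>
          if (i > 0 && i % indexInterval == 0)
            index += ((r.offset - baseOffset, bytePos))
          bytePos += r.payload.length
        }
        (valid, index.result())
      }

      def main(args: Array[String]): Unit = {
        val good = (0L until 5L).map { i =>
          val payload = s"message-$i".getBytes("UTF-8")
          Record(100L + i, payload, computeCrc(payload))
        }
        // Simulate garbage the filesystem "legally" left in the unflushed tail.
        val corrupt = Record(105L, "garbage".getBytes("UTF-8"), crc = 0L)
        val (valid, index) = recover(good :+ corrupt, baseOffset = 100L)
        println(s"kept ${valid.size} of ${good.size + 1} records; index: $index")
      }
    }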

The actual reason for the sanity check failure is that we memory map
fixed-size index files, which means the first part of the index
contains real entries and the rest is just unfilled zeros. Each
entry stores the increment from the base offset of the segment, so
an unfilled zero at the end reads as a last offset equal to the base
offset, and the sanity check (which requires the last offset to be
greater than the base offset) fails. Had that check not failed, the
index would still have been rebuilt, I think, since we're going to
run recovery on the segment regardless, but I guess that is just the
check that gets hit first (I haven't looked at the sequencing).
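
The shape of that check is roughly the following. Again just a
sketch, not the real OffsetIndex code; the class and method names
here are illustrative, but the 8-byte relative-offset/position entry
layout and the lastOffset > baseOffset requirement mirror what the
line you linked is doing:

    import java.io.RandomAccessFile
    import java.nio.channels.FileChannel

    // Sketch of why the sanity check trips after a hard kill.
    class OffsetIndexSketch(path: String, val baseOffset: Long) {
      private val raf  = new RandomAccessFile(path, "r")
      private val mmap = raf.getChannel.map(FileChannel.MapMode.READ_ONLY, 0, raf.length)

      // Each entry is 8 bytes: a 4-byte offset relative to baseOffset
      // followed by a 4-byte file position.
      val entries: Int = (raf.length / 8).toInt

      private def relativeOffset(slot: Int): Int = mmap.getInt(slot * 8)

      // Last offset = baseOffset plus the relative offset in the last slot.
      def lastOffset: Long = baseOffset + relativeOffset(entries - 1)

      // After an unclean shutdown the file is still its full preallocated
      // size, so the last slot is an unfilled zero, lastOffset == baseOffset,
      // and this require fails.
      def sanityCheck(): Unit =
        require(entries == 0 || lastOffset > baseOffset,
          s"Corrupt index $path: last offset $lastOffset should be larger " +
            s"than base offset $baseOffset")
    }

As I recall, on a clean shutdown the index file is trimmed down to
just the real entries before close, which is why you only hit this
after a hard kill.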

-Jay

On Wed, Mar 16, 2016 at 3:17 PM, Scott Reynolds <sreyno...@twilio.com> wrote:
> In a test in our staging environment, we kill -9'd the broker. It was
> started back up by runit and started recovering. We are seeing errors
> like this:
>
> WARN Found an corrupted index file,
> /mnt/services/kafka/data/TOPIC-17/00000000000016763460.index, deleting and
> rebuilding index... (kafka.log.Log)
>
> The file size is a multiple of 8 (10485760 bytes) and the file has entries.
>
> So this leads me to believe that lastOffset <= baseOffset (
> https://code.hq.twilio.com/data/kafka/blob/35003fd51b80d605d40339dead623012442a92ab/core/src/main/scala/kafka/log/OffsetIndex.scala#L354
> )
>
> Was wondering how that could happen? Isn't baseOffset taken from the
> filename and therefore the FIRST entry in the log? All other entries
> should be greater than that.
