Hi Ted,

The change you mention is not part of 0.11.0.2.

Ismael

On Sat, Jan 6, 2018 at 3:31 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> bq. WARN Found a corrupted index file due to requirement failed: Corrupt
> index found, index file
> (/data/kafka/data-processed-15/00000000000054942918.index)
>
> Can you search backward for 00000000000054942918.index in the log to see if
> we can find the cause for corruption ?
>
> This part of code was recently changed by :
>
> KAFKA-6324; Change LogSegment.delete to deleteIfExists and harden log
> recovery
>
> Cheers
>
> On Sat, Jan 6, 2018 at 7:18 AM, Vincent Rischmann <vinc...@rischmann.fr>
> wrote:
>
> > Here's an excerpt just after the broker started:
> > https://pastebin.com/tZqze4Ya
> >
> > After more than 8 hours of recovery the broker finally started. I haven't
> > read through all 8 hours of log but the parts I looked at are like the
> > pastebin.
> >
> > I'm not seeing much in the log cleaner logs either, they look normal. We
> > have a couple of compacted topics but seems only the consumer offsets is
> > ever compacted (the other topics don't have much traffic).
> >
> > On Sat, Jan 6, 2018, at 12:02 AM, Brett Rann wrote:
> > > What do the broker logs say its doing during all that time?
> > >
> > > There are some consumer offset / log cleaner bugs which caused us
> > > similarly log delays. that was easily visible by watching the log
> > > cleaner activity in the logs, and in our monitoring of partition sizes
> > > watching them go down, along with IO activity on the host for those
> > > files.
> > >
> > > On Sat, Jan 6, 2018 at 7:48 AM, Vincent Rischmann <vinc...@rischmann.fr>
> > > wrote:
> > >
> > > > Hello,
> > > >
> > > > so I'm upgrading my brokers from 0.10.1.1 to 0.11.0.2 to fix this bug
> > > > https://issues.apache.org/jira/browse/KAFKA-4523
> > > > Unfortunately while stopping one broker, it crashed exactly because of
> > > > this bug. No big deal usually, except after restarting Kafka in
> > > > 0.11.0.2 the recovery is taking a really long time.
> > > > I have around 6TB of data on that broker, and before when it crashed it
> > > > usually took around 30 to 45 minutes to recover, but now I'm at almost
> > > > 5h since Kafka started and it's still not recovered.
> > > > I'm wondering what could have changed to have such a dramatic effect on
> > > > recovery time ? Is there maybe something I can tweak to try to reduce
> > > > the time ?
> > > > Thanks.