Hi,

We have a Kafka setup with 6 brokers; our topics have replication factor 3 and a single partition each.
After an improper shutdown, we had corrupted index files on two of our production servers. This caused "WARN Found a corrupted index file due to requirement failed: Corrupt index found," messages, and Kafka shut down on startup with a "FATAL Exiting Kafka.(kafka.server.KafkaServerStartable)" message. All topics are still accessible, but unfortunately the most important one has only a single ISR left.

We decided to clear all Kafka data and restart the brokers, believing they would fetch all needed data back from the leader and become in-sync again. On startup, however, we see the following messages in the log (repeating at an alarming rate):

WARN [ReplicaFetcherThread-0-3]: Replica 4 for partition <topic>-0 reset its fetch offset from 0 to current leader 3's start offset 0 (kafka.server.ReplicaFetcherThread)
ERROR [ReplicaFetcherThread-0-3]: Current offset 0 for partition [<topic>,0] out of range; reset offset to 0 (kafka.server.ReplicaFetcherThread)

This looks to me like a problem similar to https://issues.apache.org/jira/browse/KAFKA-6003

While trying to reassign a topic that had lost one of its ISRs (I kept the existing ISRs, but removed the failing broker and added an existing one), we got the same messages on that existing broker:

[2017-11-08 16:21:30,893] WARN [ReplicaFetcherThread-0-1]: Replica 5 for partition <topic>-0 reset its fetch offset from 0 to current leader 1's start offset 0 (kafka.server.ReplicaFetcherThread)
[2017-11-08 16:21:30,893] ERROR [ReplicaFetcherThread-0-1]: Current offset 0 for partition [<topic>,0] out of range; reset offset to 0 (kafka.server.ReplicaFetcherThread)

This is even more annoying, since I don't want to shut down that broker as well, and it generates about 800M of logs per hour (fortunately only about 100M compressed).

Does anybody have a clue what's going on and how to fix it? If the fix in 0.11.0.2 would solve our issue, how soon can we expect that release (if at all)?

Thanks,
Luc