Thanks Ismael, I hope this will solve the issue in the future. For now, unless we find a workaround soon, we'll probably back up the data we have and rebuild the cluster from scratch...

Luc

-----Original Message-----
From: isma...@gmail.com [mailto:isma...@gmail.com] On Behalf Of Ismael Juma
Sent: Wednesday, 8 November 2017 16:57
To: Kafka Users <users@kafka.apache.org>
Subject: [Possibly spoofed] Re: 0.11.0.1: ReplicaFetcherThread exceptions after clearing corrupted broker data and restarting

Hi Luc,

The first RC for 0.11.0.2 will be released this week.

Ismael

On Wed, Nov 8, 2017 at 3:33 PM, Vanlerberghe, Luc <luc.vanlerber...@bvdinfo.com> wrote:

> Hi,
>
> We have a kafka setup with 6 brokers and topics having replication factor 3 (single partition).
>
> After an improper shutdown, we had corrupted index files on two of our production servers, causing "WARN Found a corrupted index file due to requirement failed: Corrupt index found," messages and kafka shutting down on startup with a "FATAL Exiting Kafka.(kafka.server.KafkaServerStartable)" message.
>
> All topics are still accessible, but unfortunately the most important one has only a single ISR left.
>
> We decided to clear all kafka data and restart the brokers believing they would fetch all needed data back from the leader to become in-sync again, but on startup we see the following messages in the log (repeating at an alarming rate):
>
> WARN [ReplicaFetcherThread-0-3]: Replica 4 for partition <topic>-0 reset its fetch offset from 0 to current leader 3's start offset 0 (kafka.server.ReplicaFetcherThread)
> ERROR [ReplicaFetcherThread-0-3]: Current offset 0 for partition [<topic>,0] out of range; reset offset to 0 (kafka.server.ReplicaFetcherThread)
>
> This looks to me as a similar problem as https://issues.apache.org/jira/browse/KAFKA-6003
>
> While trying to reassign a topic that had lost one of its ISRs (I kept the existing ISRs, but deleted the failing broker and added an existing one) we got the same messages on that existing broker.
>
> [2017-11-08 16:21:30,893] WARN [ReplicaFetcherThread-0-1]: Replica 5 for partition <topic>-0 reset its fetch offset from 0 to current leader 1's start offset 0 (kafka.server.ReplicaFetcherThread)
> [2017-11-08 16:21:30,893] ERROR [ReplicaFetcherThread-0-1]: Current offset 0 for partition [<topic>,0] out of range; reset offset to 0 (kafka.server.ReplicaFetcherThread)
>
> This is even more annoying since I don't want to shut down that broker as well and it generates about 800M logs per hour (fortunately only about 100M compressed)
>
> Does anybody have a clue what's going on and how to fix it?
> If the fix in 0.11.0.2 would solve our issue, how soon can we expect the release (if at all)?
>
> Thanks,
>
> Luc