[ https://issues.apache.org/jira/browse/KAFKA-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099943#comment-16099943 ]
Ismael Juma commented on KAFKA-5634:
------------------------------------

[~hachikuji], restarting the follower is not enough as the thread will die again, right? The simplest way is probably to just delete all the data in the follower so that it re-replicates (assuming that re-replication is not too costly).

> Replica fetcher thread crashes due to OffsetOutOfRangeException
> ---------------------------------------------------------------
>
>                 Key: KAFKA-5634
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5634
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.11.0.0
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Critical
>              Labels: regression, reliability
>             Fix For: 0.11.0.1
>
>
> We have seen the following exception recently:
> {code}
> kafka.common.KafkaException: error processing data for partition [foo,0] offset 1459250
>         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:203)
>         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:174)
>         at scala.Option.foreach(Option.scala:257)
>         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:174)
>         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:171)
>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply$mcV$sp(AbstractFetcherThread.scala:171)
>         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:171)
>         at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:171)
>         at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:213)
>         at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:169)
>         at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:112)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:64)
> Caused by: org.apache.kafka.common.errors.OffsetOutOfRangeException: The specified offset 1459250 is higher than the high watermark 1459032 of the partition foo-0
> {code}
> The error check was added in the patch for KIP-107: https://github.com/apache/kafka/commit/8b05ad406d4cba6a75d1683b6d8699c3ab28f9d6.
> After investigation, we found that it is possible for the log start offset on the leader to get ahead of the high watermark on the follower after segment deletion. The check therefore seems incorrect. The impact of this bug is that the fetcher thread crashes on the follower and the broker must be restarted.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
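
For illustration only, the condition reported in the quoted description can be sketched roughly as follows. This is not Kafka's code: the class and method names are hypothetical, and reading the "specified offset" in the exception message as the leader's log start offset is an assumption based on the description.

```java
public class LogStartOffsetCheckSketch {

    // Hypothetical stand-in for the check described in the report (not Kafka's code):
    // a leader log start offset above the follower's high watermark is rejected.
    static boolean wouldReject(long leaderLogStartOffset, long followerHighWatermark) {
        return leaderLogStartOffset > followerHighWatermark;
    }

    public static void main(String[] args) {
        // Values taken from the exception message in the report. Treating 1459250
        // as the leader's log start offset is an assumption.
        long leaderLogStartOffset = 1459250L;
        long followerHighWatermark = 1459032L;

        if (wouldReject(leaderLogStartOffset, followerHighWatermark)) {
            System.out.println("Offset " + leaderLogStartOffset
                    + " is above the follower high watermark " + followerHighWatermark
                    + "; a check of this form would fail here.");
        }
    }
}
```

If the description is right that segment deletion on the leader can legitimately produce this state, a check of this shape fails on every fetch until the follower's data changes, which would be consistent with the comment that a restart alone does not help.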