David Lao created KAFKA-1145:
--------------------------------

             Summary: Broker fail to sync after restart
                 Key: KAFKA-1145
                 URL: https://issues.apache.org/jira/browse/KAFKA-1145
             Project: Kafka
          Issue Type: Bug
          Components: replication
    Affects Versions: 0.8
            Reporter: David Lao
            Assignee: Neha Narkhede
            Priority: Critical

I'm hitting an issue where a freshly joined broker is stuck in a replication loop due to an error getting the offset.

The sequence of events is as follows:

1) broker-0 and broker-3 hold the logs for partition-1. broker-0 was the partition leader. broker-0 went down due to a machine failure (i.e. loss of the log data drive)
2) broker-3 became the leader for partition-1
3) broker-0 joins back after log drive replacement

Exceptions observed on broker-0 upon rejoining:

kafka.common.KafkaStorageException: Deleting log segment 0 failed.
        at kafka.log.Log$$anonfun$deleteSegments$1.apply(Log.scala:613)
        at kafka.log.Log$$anonfun$deleteSegments$1.apply(Log.scala:608)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
        at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:33)
        at kafka.log.Log.deleteSegments(Log.scala:608)
        at kafka.log.Log.truncateAndStartWithNewOffset(Log.scala:667)
        at kafka.server.ReplicaFetcherThread.handleOffsetOutOfRange(ReplicaFetcherThread.scala:97)
        at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$1$$anonfun$apply$mcV$sp$2.apply(AbstractFetcherThread.scala:142)
        at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$1$$anonfun$apply$mcV$sp$2.apply(AbstractFetcherThread.scala:109)
        at scala.collection.immutable.Map$Map1.foreach(Map.scala:119)
        at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$1.apply$mcV$sp(AbstractFetcherThread.scala:109)
        at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$1.apply(AbstractFetcherThread.scala:109)
        at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$1.apply(AbstractFetcherThread.scala:109)
        at kafka.utils.Utils$.inLock(Utils.scala:565)
        at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:108)
        at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:86)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)

Logs are attached.

--
This message was sent by Atlassian JIRA
(v6.1#6144)