[ https://issues.apache.org/jira/browse/KAFKA-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13833374#comment-13833374 ]
Jun Rao commented on KAFKA-1145:
--------------------------------

Thanks for reporting this. Does the same issue exist in trunk?

> Broker fail to sync after restart
> ---------------------------------
>
>                 Key: KAFKA-1145
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1145
>             Project: Kafka
>          Issue Type: Bug
>          Components: replication
>    Affects Versions: 0.8
>            Reporter: David Lao
>            Assignee: Neha Narkhede
>            Priority: Critical
>         Attachments: broker-0.txt, broker-3.txt
>
>
> I'm hitting this issue where a freshly joined broker is stuck in a replication
> loop due to an error getting the offset.
> The sequence of events is as follows:
> 1) broker-0 and broker-3 hold the logs for partition-1. broker-0 was the
> partition leader. broker-0 went down due to a machine failure (i.e. loss of
> the log data drive)
> 2) broker-3 became the leader for partition-1
> 3) broker-0 joined back after log drive replacement
> Exceptions observed on broker-0 upon rejoining
> kafka.common.KafkaStorageException: Deleting log segment 0 failed.
>     at kafka.log.Log$$anonfun$deleteSegments$1.apply(Log.scala:613)
>     at kafka.log.Log$$anonfun$deleteSegments$1.apply(Log.scala:608)
>     at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
>     at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:33)
>     at kafka.log.Log.deleteSegments(Log.scala:608)
>     at kafka.log.Log.truncateAndStartWithNewOffset(Log.scala:667)
>     at kafka.server.ReplicaFetcherThread.handleOffsetOutOfRange(ReplicaFetcherThread.scala:97)
>     at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$1$$anonfun$apply$mcV$sp$2.apply(AbstractFetcherThread.scala:142)
>     at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$1$$anonfun$apply$mcV$sp$2.apply(AbstractFetcherThread.scala:109)
>     at scala.collection.immutable.Map$Map1.foreach(Map.scala:119)
>     at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$1.apply$mcV$sp(AbstractFetcherThread.scala:109)
>     at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$1.apply(AbstractFetcherThread.scala:109)
>     at kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$1.apply(AbstractFetcherThread.scala:109)
>     at kafka.utils.Utils$.inLock(Utils.scala:565)
>     at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:108)
>     at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:86)
>     at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
> logs are attached

--
This message was sent by Atlassian JIRA
(v6.1#6144)