[ 
https://issues.apache.org/jira/browse/KAFKA-669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede resolved KAFKA-669.
---------------------------------

    Resolution: Implemented
      Assignee: Neha Narkhede
    
> Irrecoverable error on leader while rolling to a new segment
> ------------------------------------------------------------
>
>                 Key: KAFKA-669
>                 URL: https://issues.apache.org/jira/browse/KAFKA-669
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Blocker
>              Labels: bugs
>             Fix For: 0.8
>
>
> I setup a test cluster of 5 brokers and setup migration tool to send data for 
> ~1000 partitions to this cluster. I see  a bunch of 
> FailedToSendMessageExceptions on the producer. After tracing some of these 
> failed requests, I found that the broker sends a ProducerResponseStatus(-1, 
> -1) to the producer. The impact is the producer marks the messages for those 
> partitions as failed, retries and eventually gives up. This seems like a bug 
> in the truncation logic. I traced the state changes topic SpamReportEvent and 
> seems like first it was the follower, so it truncated its log. Then it became 
> the leader and got into an irrecoverable error state due to log rolling. 
> [2012-12-12 01:47:01,951] INFO [Log Manager on Broker 3] Loading log 
> 'SpamReportEvent-0' (kafka.log.LogManager)
> [2012-12-12 01:47:01,951] INFO Creating log segment 
> /mnt/u001/kafka_08_long_running_test/kafka-logs/SpamReportEvent-0/00000000000000000000.log
>  (kafka.log.FileMessageSet)
> [2012-12-12 01:47:01,951] INFO Created index file 
> /mnt/u001/kafka_08_long_running_test/kafka-logs/SpamReportEvent-0/00000000000000000000.index
>  with maxEntries = 1310720, maxIndexSize = 10485760, entries = 1310720, 
> lastOffset = 0 (kafka.log.OffsetIndex)
> [2012-12-12 01:47:02,949] INFO Replica Manager on Broker 3: Starting the 
> follower state transition to follow leader 4 for topic SpamReportEvent 
> partition 0 (kafka.server.ReplicaManager)
> [2012-12-12 01:47:02,950] INFO [Kafka Log on Broker 3], Truncated log segment 
> /mnt/u001/kafka_08_long_running_test/kafka-logs/SpamReportEvent-0/00000000000000000000.log
>  to target offset 0 (kafka.log.Log)
> [2012-12-12 01:47:34,745] INFO Replica Manager on Broker 3: Becoming Leader 
> for topic [SpamReportEvent] partition [0] (kafka.server.ReplicaManager)
> [2012-12-12 01:47:34,745] INFO Replica Manager on Broker 3: Completed the 
> leader state transition for topic SpamReportEvent partition 0 
> (kafka.server.ReplicaManager)
> [2012-12-12 01:47:36,071] INFO Replica Manager on Broker 3: Becoming Leader 
> for topic [SpamReportEvent] partition [0] (kafka.server.ReplicaManager)
> [2012-12-12 01:47:36,071] INFO Replica Manager on Broker 3: Completed the 
> leader state transition for topic SpamReportEvent partition 0 
> (kafka.server.ReplicaManager)
> [2012-12-12 01:48:10,545] INFO Replica Manager on Broker 3: Becoming Leader 
> for topic [SpamReportEvent] partition [0] (kafka.server.ReplicaManager)
> [2012-12-12 01:48:10,545] INFO Replica Manager on Broker 3: Completed the 
> leader state transition for topic SpamReportEvent partition 0 
> (kafka.server.ReplicaManager)
> [2012-12-12 02:02:08,148] ERROR [KafkaApi-3] Error processing ProducerRequest 
> on SpamReportEvent:0 (kafka.server.Kafka
> Apis)
> [2012-12-12 02:06:06,081] ERROR [KafkaApi-3] Error processing ProducerRequest 
> on SpamReportEvent:0 (kafka.server.KafkaApis)
> kafka.common.KafkaException: Trying to roll a new log segment for topic 
> partition SpamReportEvent-0 with start offset 0 while it already exists
>         at kafka.log.Log.rollToOffset(Log.scala:465)
>         at kafka.log.Log.roll(Log.scala:443)
>         at kafka.log.Log.maybeRoll(Log.scala:431)
>         at kafka.log.Log.append(Log.scala:258)
>         at 
> kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:246)
>         at 
> kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:239)
>         at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
>         at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
>         at scala.collection.immutable.Map$Map1.foreach(Map.scala:105)
>         at 
> scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
>         at scala.collection.immutable.Map$Map1.map(Map.scala:93)
>         at kafka.server.KafkaApis.appendToLocalLog(KafkaApis.scala:239)
>         at kafka.server.KafkaApis.handleProducerRequest(KafkaApis.scala:179)
>         at kafka.server.KafkaApis.handle(KafkaApis.scala:56)
>         at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:41)
>         at java.lang.Thread.run(Thread.java:619)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to