[ https://issues.apache.org/jira/browse/KAFKA-6388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Gustafson resolved KAFKA-6388. ------------------------------------ Resolution: Fixed Fix Version/s: 2.0.2 2.1.1 > Error while trying to roll a segment that already exists > -------------------------------------------------------- > > Key: KAFKA-6388 > URL: https://issues.apache.org/jira/browse/KAFKA-6388 > Project: Kafka > Issue Type: Bug > Components: log > Affects Versions: 1.0.0 > Reporter: David Hay > Priority: Blocker > Fix For: 2.1.1, 2.0.2 > > > Recreating this issue from KAFKA-654 as we've been hitting it repeatedly in > our attempts to get a stable 1.0 cluster running (upgrading from 0.8.2.2). > After spending 30 min or more spewing log messages like this: > {noformat} > [2017-12-19 16:44:28,998] INFO Replica loaded for partition > screening.save.results.screening.save.results.processor.error-43 with initial > high watermark 0 (kafka.cluster.Replica) > {noformat} > Eventually, the replica thread throws the error below (also referenced in the > original issue). If I remove that partition from the data directory and > bounce the broker, it eventually rebalances (assuming it doesn't hit a > different partition with the same error). > {noformat} > 2017-12-19 15:16:24,227] WARN Newly rolled segment file > 00000000000000000002.log already exists; deleting it first (kafka.log.Log) > [2017-12-19 15:16:24,227] WARN Newly rolled segment file > 00000000000000000002.index already exists; deleting it first (kafka.log.Log) > [2017-12-19 15:16:24,227] WARN Newly rolled segment file > 00000000000000000002.timeindex already exists; deleting it first > (kafka.log.Log) > [2017-12-19 15:16:24,232] INFO [ReplicaFetcherManager on broker 2] Removed > fetcher for partitions __consumer_offsets-20 > (kafka.server.ReplicaFetcherManager) > [2017-12-19 15:16:24,297] ERROR [ReplicaFetcher replicaId=2, leaderId=1, > fetcherId=0] Error due to (kafka.server.ReplicaFetcherThread) > kafka.common.KafkaException: Error processing data for partition > sr.new.sr.new.processor.error-38 offset 2 > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:204) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:172) > at scala.Option.foreach(Option.scala:257) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:172) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1.apply(AbstractFetcherThread.scala:169) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply$mcV$sp(AbstractFetcherThread.scala:169) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:169) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2.apply(AbstractFetcherThread.scala:169) > at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:217) > at > kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:167) > at > kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:113) > at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:64) > Caused by: kafka.common.KafkaException: Trying to roll a new log segment for > topic partition sr.new.sr.new.processor.error-38 with start offset 2 while it > already exists. > at kafka.log.Log$$anonfun$roll$2.apply(Log.scala:1338) > at kafka.log.Log$$anonfun$roll$2.apply(Log.scala:1297) > at kafka.log.Log.maybeHandleIOException(Log.scala:1669) > at kafka.log.Log.roll(Log.scala:1297) > at kafka.log.Log.kafka$log$Log$$maybeRoll(Log.scala:1284) > at kafka.log.Log$$anonfun$append$2.apply(Log.scala:710) > at kafka.log.Log$$anonfun$append$2.apply(Log.scala:624) > at kafka.log.Log.maybeHandleIOException(Log.scala:1669) > at kafka.log.Log.append(Log.scala:624) > at kafka.log.Log.appendAsFollower(Log.scala:607) > at > kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:102) > at > kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:41) > at > kafka.server.AbstractFetcherThread$$anonfun$processFetchRequest$2$$anonfun$apply$mcV$sp$1$$anonfun$apply$2.apply(AbstractFetcherThread.scala:184) > ... 13 more > [2017-12-19 15:16:24,302] INFO [ReplicaFetcher replicaId=2, leaderId=1, > fetcherId=0] Stopped (kafka.server.ReplicaFetcherThread) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)