I've filed: https://issues.apache.org/jira/browse/KAFKA-1108

On Tue, Oct 29, 2013 at 4:29 PM, Jason Rosenberg <j...@squareup.com> wrote:

> Here's another exception I see during controlled shutdown (this time
> there was not an unclean shutdown problem). Should I be concerned about
> this exception? Is any data loss possible with this? This one happened
> after the first "Retrying controlled shutdown after the previous attempt
> failed..." message. The controlled shutdown subsequently succeeded
> without another retry (but with a few more of these exceptions).
>
> Again, there was no "Remaining partitions to move..." message before the
> retrying message, so I assume the retry happens after an IOException
> (which is not logged in KafkaServer.controlledShutdown).
>
> 2013-10-29 20:03:31,883 INFO [kafka-request-handler-4]
> controller.ReplicaStateMachine - [Replica state machine on controller 10]:
> Invoking state change to OfflineReplica for replicas
> PartitionAndReplica(mytopic,0,10)
> 2013-10-29 20:03:31,883 ERROR [kafka-request-handler-4] change.logger -
> Controller 10 epoch 190 initiated state change of replica 10 for partition
> [mytopic,0] to OfflineReplica failed
> java.lang.AssertionError: assertion failed: Replica 10 for partition
> [mytopic,0] should be in the NewReplica,OnlineReplica states before moving
> to OfflineReplica state. Instead it is in OfflineReplica state
>         at scala.Predef$.assert(Predef.scala:91)
>         at kafka.controller.ReplicaStateMachine.assertValidPreviousStates(ReplicaStateMachine.scala:209)
>         at kafka.controller.ReplicaStateMachine.handleStateChange(ReplicaStateMachine.scala:167)
>         at kafka.controller.ReplicaStateMachine$$anonfun$handleStateChanges$2.apply(ReplicaStateMachine.scala:89)
>         at kafka.controller.ReplicaStateMachine$$anonfun$handleStateChanges$2.apply(ReplicaStateMachine.scala:89)
>         at scala.collection.immutable.Set$Set1.foreach(Set.scala:81)
>         at kafka.controller.ReplicaStateMachine.handleStateChanges(ReplicaStateMachine.scala:89)
>         at kafka.controller.KafkaController$$anonfun$shutdownBroker$4$$anonfun$apply$2.apply(KafkaController.scala:199)
>         at kafka.controller.KafkaController$$anonfun$shutdownBroker$4$$anonfun$apply$2.apply(KafkaController.scala:184)
>         at scala.Option.foreach(Option.scala:121)
>         at kafka.controller.KafkaController$$anonfun$shutdownBroker$4.apply(KafkaController.scala:184)
>         at kafka.controller.KafkaController$$anonfun$shutdownBroker$4.apply(KafkaController.scala:180)
>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:57)
>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:43)
>         at kafka.controller.KafkaController.shutdownBroker(KafkaController.scala:180)
>         at kafka.server.KafkaApis.handleControlledShutdownRequest(KafkaApis.scala:133)
>         at kafka.server.KafkaApis.handle(KafkaApis.scala:72)
>         at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:42)
>         at java.lang.Thread.run(Thread.java:662)
>
> Jason
>
>
> On Fri, Oct 25, 2013 at 11:51 PM, Jason Rosenberg <j...@squareup.com> wrote:
>
>> On Fri, Oct 25, 2013 at 9:16 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
>>
>>> Unclean shutdown could result in data loss - since you are moving
>>> leadership to a replica that has fallen out of ISR. i.e., its log end
>>> offset is behind the last committed message to this partition.
>>
>> But if data is written with 'request.required.acks=-1', no data should
>> be lost, no? Or will partitions be truncated wholesale after an unclean
>> shutdown?
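
(For concreteness on the setting referenced just above: in the 0.8 producer,
request.required.acks is an ordinary producer config property. A minimal
producer.properties sketch - the broker list here is a placeholder, not from
this thread:

  # -1 asks the leader to wait for acknowledgements from all in-sync
  # replicas before a send is considered successful (0.8 producer config).
  metadata.broker.list=broker1:9092,broker2:9092
  request.required.acks=-1
)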
>>
>>> Take a look at the packaged log4j.properties file. The controller's
>>> partition/replica state machines and its channel manager, which
>>> sends/receives leaderandisr requests/responses to brokers, use a
>>> stateChangeLogger. The replica managers on all brokers also use this
>>> logger.
>>
>> Ah... so it looks like most things logged with the stateChangeLogger are
>> logged at the TRACE log level, and that's the default setting in the
>> packaged log4j.properties file. Needless to say, my contained KafkaServer
>> is not currently using that log4j.properties (we are just using a
>> rootLogger with level = INFO by default). I can probably add a special
>> rule to use TRACE for the state.change.logger category. However, I'm not
>> sure I can make it so that that logging all goes to its own separate log
>> file...
>>
>>> Our logging can improve - e.g., it looks like on controller movement
>>> we could retry without saying why.
>>
>> I can file a jira for this, but I'm not sure what it should say!
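
(On routing the state change log to its own file: the log4j.properties
packaged with Kafka does exactly that, so a rule along these lines should
work. A minimal sketch following the packaged file's conventions - the
appender name and the ${kafka.logs.dir} path are just the packaged defaults
and can be anything:

  # Capture state.change.logger output at TRACE in its own rolling file;
  # additivity=false keeps it out of the root logger's appenders.
  log4j.appender.stateChangeAppender=org.apache.log4j.DailyRollingFileAppender
  log4j.appender.stateChangeAppender.DatePattern='.'yyyy-MM-dd-HH
  log4j.appender.stateChangeAppender.File=${kafka.logs.dir}/state-change.log
  log4j.appender.stateChangeAppender.layout=org.apache.log4j.PatternLayout
  log4j.appender.stateChangeAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

  log4j.logger.state.change.logger=TRACE, stateChangeAppender
  log4j.additivity.state.change.logger=false
)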