Auto leader balancing has some known issues when used together with controlled shutdown, so we don't recommend turning it on in 0.8.1.1.

Thanks,

Jun

On Wed, Jun 18, 2014 at 1:41 AM, Bongyeon Kim <bongyeon....@gmail.com> wrote:

> Yes, my server.properties file contains it:
>
> auto.leader.rebalance.enable=true
>
> On Wed, Jun 18, 2014 at 12:44 PM, Jun Rao <jun...@gmail.com> wrote:
>
> > Did you have auto leader balancing enabled?
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Jun 17, 2014 at 5:06 PM, Bongyeon Kim <bongyeon....@gmail.com> wrote:
> >
> > > There are some error logs about failing leader election, like this:
> > >
> > > [2014-06-18 08:59:21,014] ERROR Controller 7 epoch 4 encountered error while electing leader for partition [topicDEBUG,5] due to: Preferred replica 1 for partition [topicDEBUG,5] is either not alive or not in the isr. Current leader and ISR: [{"leader":8,"leader_epoch":6,"isr":[8,2]}]. (state.change.logger)
> > > [2014-06-18 08:59:21,014] ERROR Controller 7 epoch 4 initiated state change for partition [topicDEBUG,5] from OnlinePartition to OnlinePartition failed (state.change.logger)
> > > kafka.common.StateChangeFailedException: encountered error while electing leader for partition [topicDEBUG,5] due to: Preferred replica 1 for partition [topicDEBUG,5] is either not alive or not in the isr. Current leader and ISR: [{"leader":8,"leader_epoch":6,"isr":[8,2]}].
> > >     at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:360)
> > >     at kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:187)
> > >     at kafka.controller.PartitionStateMachine$$anonfun$handleStateChanges$2.apply(PartitionStateMachine.scala:125)
> > >     at kafka.controller.PartitionStateMachine$$anonfun$handleStateChanges$2.apply(PartitionStateMachine.scala:124)
> > >     at scala.collection.immutable.Set$Set1.foreach(Set.scala:86)
> > >     at kafka.controller.PartitionStateMachine.handleStateChanges(PartitionStateMachine.scala:124)
> > >     at kafka.controller.KafkaController.onPreferredReplicaElection(KafkaController.scala:618)
> > >     at kafka.controller.KafkaController$$anonfun$kafka$controller$KafkaController$$checkAndTriggerPartitionRebalance$4$$anonfun$apply$17$$anonfun$apply$5.apply$mcV$sp(KafkaController.scala:1118)
> > >     at kafka.controller.KafkaController$$anonfun$kafka$controller$KafkaController$$checkAndTriggerPartitionRebalance$4$$anonfun$apply$17$$anonfun$apply$5.apply(KafkaController.scala:1112)
> > >     at kafka.controller.KafkaController$$anonfun$kafka$controller$KafkaController$$checkAndTriggerPartitionRebalance$4$$anonfun$apply$17$$anonfun$apply$5.apply(KafkaController.scala:1112)
> > >     at kafka.utils.Utils$.inLock(Utils.scala:538)
> > >     at kafka.controller.KafkaController$$anonfun$kafka$controller$KafkaController$$checkAndTriggerPartitionRebalance$4$$anonfun$apply$17.apply(KafkaController.scala:1109)
> > >     at kafka.controller.KafkaController$$anonfun$kafka$controller$KafkaController$$checkAndTriggerPartitionRebalance$4$$anonfun$apply$17.apply(KafkaController.scala:1107)
> > >     at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:95)
> > >     at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:95)
> > >     at scala.collection.Iterator$class.foreach(Iterator.scala:772)
> > >     at scala.collection.mutable.HashTable$$anon$1.foreach(HashTable.scala:157)
> > >     at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:190)
> > >     at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:45)
> > >     at scala.collection.mutable.HashMap.foreach(HashMap.scala:95)
> > >     at kafka.controller.KafkaController$$anonfun$kafka$controller$KafkaController$$checkAndTriggerPartitionRebalance$4.apply(KafkaController.scala:1107)
> > >     at kafka.controller.KafkaController$$anonfun$kafka$controller$KafkaController$$checkAndTriggerPartitionRebalance$4.apply(KafkaController.scala:1086)
> > >     at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:178)
> > >     at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:347)
> > >     at kafka.controller.KafkaController.kafka$controller$KafkaController$$checkAndTriggerPartitionRebalance(KafkaController.scala:1086)
> > >     at kafka.controller.KafkaController$$anonfun$onControllerFailover$1.apply$mcV$sp(KafkaController.scala:324)
> > >     at kafka.utils.KafkaScheduler$$anon$1.run(KafkaScheduler.scala:100)
> > >     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> > >     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
> > >     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
> > >     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > >     at java.lang.Thread.run(Thread.java:744)
> > > Caused by: kafka.common.StateChangeFailedException: Preferred replica 1 for partition [topicDEBUG,5] is either not alive or not in the isr. Current leader and ISR: [{"leader":8,"leader_epoch":6,"isr":[8,2]}]
> > >     at kafka.controller.PreferredReplicaPartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:144)
> > >     at kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:336)
> > >     ... 33 more
> > >
> > > Thanks.
> > >
> > > On Tue, Jun 17, 2014 at 12:42 PM, Jun Rao <jun...@gmail.com> wrote:
> > >
> > > > Any error in the controller and state-change log?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Mon, Jun 16, 2014 at 6:05 PM, Bongyeon Kim <bongyeon....@gmail.com> wrote:
> > > >
> > > > > Hi, team.
> > > > >
> > > > > I'm using Kafka 0.8.1.1.
> > > > > I'm running 8 brokers on 4 machines (2 brokers per machine), and I have 3 topics, each with 16 partitions and 3 replicas.
> > > > >
> > > > > The kafka-topics describe output is:
> > > > >
> > > > > Topic:topicCDR PartitionCount:16 ReplicationFactor:3 Configs: retention.ms=3600000
> > > > > Topic: topicCDR Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3,2
> > > > > Topic: topicCDR Partition: 1 Leader: 4 Replicas: 4,2,3 Isr: 3,4,2
> > > > > Topic: topicCDR Partition: 2 Leader: 5 Replicas: 5,3,4 Isr: 3,4,5
> > > > > Topic: topicCDR Partition: 3 Leader: 6 Replicas: 6,4,5 Isr: 4,5,6
> > > > > Topic: topicCDR Partition: 4 Leader: 7 Replicas: 7,5,6 Isr: 5,6,7
> > > > > Topic: topicCDR Partition: 5 Leader: 8 Replicas: 8,6,7 Isr: 6,7,8
> > > > > Topic: topicCDR Partition: 6 Leader: 1 Replicas: 1,7,8 Isr: 1,7,8
> > > > > Topic: topicCDR Partition: 7 Leader: 2 Replicas: 2,8,1 Isr: 8,2
> > > > > Topic: topicCDR Partition: 8 Leader: 3 Replicas: 3,2,4 Isr: 3,4,2
> > > > > Topic: topicCDR Partition: 9 Leader: 4 Replicas: 4,3,5 Isr: 3,4,5
> > > > > Topic: topicCDR Partition: 10 Leader: 5 Replicas: 5,4,6 Isr: 4,5,6
> > > > > Topic: topicCDR Partition: 11 Leader: 6 Replicas: 6,5,7 Isr: 5,6,7
> > > > > Topic: topicCDR Partition: 12 Leader: 7 Replicas: 7,6,8 Isr: 6,7,8
> > > > > Topic: topicCDR Partition: 13 Leader: 8 Replicas: 8,7,1 Isr: 7,8
> > > > > Topic: topicCDR Partition: 14 Leader: 8 Replicas: 1,8,2 Isr: 8,2
> > > > > Topic: topicCDR Partition: 15 Leader: 2 Replicas: 2,1,3 Isr: 3,2
> > > > > Topic:topicDEBUG PartitionCount:16 ReplicationFactor:3 Configs: retention.ms=3600000
> > > > > Topic: topicDEBUG Partition: 0 Leader: 4 Replicas: 4,3,5 Isr: 3,4,5
> > > > > Topic: topicDEBUG Partition: 1 Leader: 5 Replicas: 5,4,6 Isr: 4,5,6
> > > > > Topic: topicDEBUG Partition: 2 Leader: 6 Replicas: 6,5,7 Isr: 5,6,7
> > > > > Topic: topicDEBUG Partition: 3 Leader: 7 Replicas: 7,6,8 Isr: 6,7,8
> > > > > Topic: topicDEBUG Partition: 4 Leader: 8 Replicas: 8,7,1 Isr: 7,8
> > > > > Topic: topicDEBUG Partition: 5 Leader: 8 Replicas: 1,8,2 Isr: 8,2
> > > > > Topic: topicDEBUG Partition: 6 Leader: 2 Replicas: 2,1,3 Isr: 3,2
> > > > > Topic: topicDEBUG Partition: 7 Leader: 3 Replicas: 3,2,4 Isr: 3,4,2
> > > > > Topic: topicDEBUG Partition: 8 Leader: 4 Replicas: 4,5,6 Isr: 4,5,6
> > > > > Topic: topicDEBUG Partition: 9 Leader: 5 Replicas: 5,6,7 Isr: 5,6,7
> > > > > Topic: topicDEBUG Partition: 10 Leader: 6 Replicas: 6,7,8 Isr: 6,7,8
> > > > > Topic: topicDEBUG Partition: 11 Leader: 7 Replicas: 7,8,1 Isr: 7,8,1
> > > > > Topic: topicDEBUG Partition: 12 Leader: 8 Replicas: 8,1,2 Isr: 8,2
> > > > > Topic: topicDEBUG Partition: 13 Leader: 3 Replicas: 1,2,3 Isr: 3,2
> > > > > Topic: topicDEBUG Partition: 14 Leader: 2 Replicas: 2,3,4 Isr: 3,4,2
> > > > > Topic: topicDEBUG Partition: 15 Leader: 3 Replicas: 3,4,5 Isr: 3,4,5
> > > > > Topic:topicTRACE PartitionCount:16 ReplicationFactor:3 Configs: retention.ms=3600000
> > > > > Topic: topicTRACE Partition: 0 Leader: 5 Replicas: 5,8,1 Isr: 5,8,1
> > > > > Topic: topicTRACE Partition: 1 Leader: 6 Replicas: 6,1,2 Isr: 6,1,2
> > > > > Topic: topicTRACE Partition: 2 Leader: 7 Replicas: 7,2,3 Isr: 3,7,2
> > > > > Topic: topicTRACE Partition: 3 Leader: 8 Replicas: 8,3,4 Isr: 3,4,8
> > > > > Topic: topicTRACE Partition: 4 Leader: 1 Replicas: 1,4,5 Isr: 1,5,4
> > > > > Topic: topicTRACE Partition: 5 Leader: 2 Replicas: 2,5,6 Isr: 5,6,2
> > > > > Topic: topicTRACE Partition: 6 Leader: 3 Replicas: 3,6,7 Isr: 3,6,7
> > > > > Topic: topicTRACE Partition: 7 Leader: 4 Replicas: 4,7,8 Isr: 4,7,8
> > > > > Topic: topicTRACE Partition: 8 Leader: 5 Replicas: 5,1,2 Isr: 5,1,2
> > > > > Topic: topicTRACE Partition: 9 Leader: 6 Replicas: 6,2,3 Isr: 3,6,2
> > > > > Topic: topicTRACE Partition: 10 Leader: 7 Replicas: 7,3,4 Isr: 3,4,7
> > > > > Topic: topicTRACE Partition: 11 Leader: 8 Replicas: 8,4,5 Isr: 4,5,8
> > > > >
> > > > > The problem is that the ISR of one of my topics is not updating, and the preferred replica keeps failing to be elected.
> > > > > In more detail, broker 1 is not being updated in topicDEBUG's ISR.
> > > > > The log of broker 1 looks completely normal and has no errors.
> > > > >
> > > > > Is this an expected situation? What do I have to do to fix it?
> > > > >
> > > > > Thanks in advance.
> > > > >
> > > > > --
> > > > > *Sincerely,*
> > > > > *Bongyeon Kim*
> > > > >
> > > > > Java Developer & Engineer
> > > > > Seoul, Korea
> > > > > Mobile: +82-10-9369-1314
> > > > > Email: bongyeon...@gmail.com
> > > > > Twitter: http://twitter.com/tigerby
> > > > > Facebook: http://facebook.com/tigerby
> > > > > Wiki: http://tigerby.com
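
For readers following the thread: the error quoted above names broker 1 as the preferred replica for [topicDEBUG,5], whose describe line lists `Replicas: 1,8,2` and `Isr: 8,2`. That is consistent with the preferred replica being the first broker in the `Replicas` list, and with the election failing while that broker is outside the ISR. The sketch below is only an illustration of that check, not part of Kafka: the function names and the regular expression are hypothetical, and it assumes the describe line format exactly as quoted in this thread.

```python
import re

# Hypothetical pattern for one partition line of the describe output,
# assuming the exact layout quoted in this thread.
LINE_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>\d[\d,]*)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def parse_ids(csv):
    """Turn a comma-separated broker id list such as '1,8,2' into [1, 8, 2]."""
    return [int(x) for x in csv.split(",") if x]


def preferred_replica_problems(describe_text):
    """Return (topic, partition, preferred, leader, isr) for every partition
    whose first listed replica (assumed to be the preferred one) is not in the ISR."""
    problems = []
    for m in LINE_RE.finditer(describe_text):
        replicas = parse_ids(m.group("replicas"))
        isr = parse_ids(m.group("isr"))
        preferred = replicas[0]
        if preferred not in isr:
            problems.append(
                (m.group("topic"), int(m.group("partition")),
                 preferred, int(m.group("leader")), isr)
            )
    return problems


# Three partition lines copied from the describe output in this thread.
SAMPLE = """\
Topic: topicDEBUG Partition: 4 Leader: 8 Replicas: 8,7,1 Isr: 7,8
Topic: topicDEBUG Partition: 5 Leader: 8 Replicas: 1,8,2 Isr: 8,2
Topic: topicDEBUG Partition: 13 Leader: 3 Replicas: 1,2,3 Isr: 3,2
"""

if __name__ == "__main__":
    for topic, partition, preferred, leader, isr in preferred_replica_problems(SAMPLE):
        print(f"[{topic},{partition}] preferred replica {preferred} "
              f"not in ISR {isr}; current leader is {leader}")
```

Run against the full describe output, a check like this would list the partitions for which an automatic preferred replica election could be expected to fail in the same way while broker 1 stays out of the ISR.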