[ https://issues.apache.org/jira/browse/KAFKA-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359549#comment-14359549 ]
Jiangjie Qin commented on KAFKA-2011:
-------------------------------------

On the server side, I would suggest keeping auto leader rebalance on, as it helps with workload balance. That said, the consumer and producer will take care of NotLeaderForPartitionException and retry.

> Rebalance with auto.leader.rebalance.enable=false
> --------------------------------------------------
>
>                 Key: KAFKA-2011
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2011
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.2.0
>         Environment: 5 Hosts of below config:
> "x86_64" "32-bit, 64-bit" "Little Endian" "24 GenuineIntel CPUs Model 44
> 1600.000MHz" "RAM 189 GB" GNU/Linux
>            Reporter: K Zakee
>            Priority: Blocker
>         Attachments: controller-logs-1.zip, controller-logs-2.zip
>
>
> Started with clean cluster 0.8.2 with 5 brokers. Setting the properties as
> below:
> auto.leader.rebalance.enable=false
> controlled.shutdown.enable=true
> controlled.shutdown.max.retries=1
> controlled.shutdown.retry.backoff.ms=5000
> default.replication.factor=3
> log.cleaner.enable=true
> log.cleaner.threads=5
> log.cleanup.policy=delete
> log.flush.scheduler.interval.ms=3000
> log.retention.minutes=1440
> log.segment.bytes=1073741824
> message.max.bytes=1000000
> num.io.threads=14
> num.network.threads=14
> num.partitions=10
> queued.max.requests=500
> num.replica.fetchers=4
> replica.fetch.max.bytes=1048576
> replica.fetch.min.bytes=51200
> replica.lag.max.messages=5000
> replica.lag.time.max.ms=30000
> replica.fetch.wait.max.ms=1000
> fetch.purgatory.purge.interval.requests=5000
> producer.purgatory.purge.interval.requests=5000
> delete.topic.enable=true
> Logs show rebalance happening well up to 24 hours after the start.
> [2015-03-07 16:52:48,969] INFO [Controller 2]: Resuming preferred replica
> election for partitions: (kafka.controller.KafkaController)
> [2015-03-07 16:52:48,969] INFO [Controller 2]: Partitions that completed
> preferred replica election: (kafka.controller.KafkaController)
> …
> [2015-03-07 12:07:06,783] INFO [Controller 4]: Resuming preferred replica
> election for partitions: (kafka.controller.KafkaController)
> ...
> [2015-03-07 09:10:41,850] INFO [Controller 3]: Resuming preferred replica
> election for partitions: (kafka.controller.KafkaController)
> ...
> [2015-03-07 08:26:56,396] INFO [Controller 1]: Starting preferred replica
> leader election for partitions (kafka.controller.KafkaController)
> ...
> [2015-03-06 16:52:59,506] INFO [Controller 2]: Partitions undergoing
> preferred replica election: (kafka.controller.KafkaController)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)