[ 
https://issues.apache.org/jira/browse/KAFKA-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359954#comment-14359954
 ] 

K Zakee commented on KAFKA-2011:
--------------------------------

Thanks Jiangjie.

The ZK client session timeout setting has stopped the controller re-elections. 
However, on digging deeper, I found that the "controller elected" log entry 
appears slightly before the ZK session timeout log entry. Here is one example:

[2015-03-11 04:28:14,435] INFO Client session timed out, have not heard from 
server in 34105ms for sessionid 0x24bf1b6f5310075, closing socket connection 
and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2015-03-11 04:27:48,007] INFO 1 successfully elected as leader 
(kafka.server.ZookeeperLeaderElector)

I am wondering: if the ZK session timeout caused the controller election 
(re-election), why do the logs show the two events in the opposite order?
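
For reference, here is a minimal sketch of the timestamp arithmetic for the two 
entries above. It only uses the values printed in those log lines and assumes 
both entries were written against the same clock; the variable names are made 
up for illustration.

```python
from datetime import datetime, timedelta

# Log timestamp layout used in the entries above, e.g. "2015-03-11 04:28:14,435".
FMT = "%Y-%m-%d %H:%M:%S,%f"

# Values copied from the two log entries quoted above.
timeout_logged = datetime.strptime("2015-03-11 04:28:14,435", FMT)
elected_logged = datetime.strptime("2015-03-11 04:27:48,007", FMT)
silence = timedelta(milliseconds=34105)  # "have not heard from server in 34105ms"

# Moment the client last heard from the ZK server, according to the timeout message.
last_heard = timeout_logged - silence

print("last heard from server:     ", last_heard.time())
print("election after last contact:", elected_logged - last_heard)
print("election before timeout log:", timeout_logged - elected_logged)
```

With these values the last contact works out to 04:27:40.330, so the election 
entry falls about 7.7 seconds after the last contact and about 26.4 seconds 
before the timeout entry was written. This is only arithmetic on the logged 
values and does not by itself establish which event caused the other.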

> Rebalance with auto.leader.rebalance.enable=false 
> --------------------------------------------------
>
>                 Key: KAFKA-2011
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2011
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.2.0
>         Environment: 5 Hosts of below config:
> "x86_64" "32-bit, 64-bit" "Little Endian" "24 GenuineIntel CPUs Model 44 
> 1600.000MHz" "RAM 189 GB" GNU/Linux
>            Reporter: K Zakee
>            Priority: Blocker
>         Attachments: controller-logs-1.zip, controller-logs-2.zip
>
>
> Started with a clean 0.8.2 cluster of 5 brokers, with the properties set as 
> below:
> auto.leader.rebalance.enable=false
> controlled.shutdown.enable=true
> controlled.shutdown.max.retries=1
> controlled.shutdown.retry.backoff.ms=5000
> default.replication.factor=3
> log.cleaner.enable=true
> log.cleaner.threads=5
> log.cleanup.policy=delete
> log.flush.scheduler.interval.ms=3000
> log.retention.minutes=1440
> log.segment.bytes=1073741824
> message.max.bytes=1000000
> num.io.threads=14
> num.network.threads=14
> num.partitions=10
> queued.max.requests=500
> num.replica.fetchers=4
> replica.fetch.max.bytes=1048576
> replica.fetch.min.bytes=51200
> replica.lag.max.messages=5000
> replica.lag.time.max.ms=30000
> replica.fetch.wait.max.ms=1000
> fetch.purgatory.purge.interval.requests=5000
> producer.purgatory.purge.interval.requests=5000
> delete.topic.enable=true
> Logs show rebalances happening as late as 24 hours after the start.
> [2015-03-07 16:52:48,969] INFO [Controller 2]: Resuming preferred replica 
> election for partitions:  (kafka.controller.KafkaController)
> [2015-03-07 16:52:48,969] INFO [Controller 2]: Partitions that completed 
> preferred replica election:  (kafka.controller.KafkaController)
> …
> [2015-03-07 12:07:06,783] INFO [Controller 4]: Resuming preferred replica 
> election for partitions:  (kafka.controller.KafkaController)
> ...
> [2015-03-07 09:10:41,850] INFO [Controller 3]: Resuming preferred replica 
> election for partitions:  (kafka.controller.KafkaController)
> ...
> [2015-03-07 08:26:56,396] INFO [Controller 1]: Starting preferred replica 
> leader election for partitions  (kafka.controller.KafkaController)
> ...
> [2015-03-06 16:52:59,506] INFO [Controller 2]: Partitions undergoing 
> preferred replica election:  (kafka.controller.KafkaController)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
