[jira] [Commented] (KAFKA-2011) Rebalance with auto.leader.rebalance.enable=false

K Zakee (JIRA) Tue, 10 Mar 2015 16:33:06 -0700

    [ https://issues.apache.org/jira/browse/KAFKA-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355936#comment-14355936 ]


K Zakee commented on KAFKA-2011:
--------------------------------

The preferred replica leader election occurred again today. Below are the 
logs I see from that time. 
---------------------------
[2015-03-10 13:51:08,834] INFO Client session timed out, have not heard from 
server in 16526ms for sessionid 0x24bf1b6f531006b, closing socket connection 
and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2015-03-10 13:51:09,414] INFO Socket connection established to {host/ip}:2181, 
initiating session (org.apache.zookeeper.ClientCnxn)
[2015-03-10 13:51:09,415] INFO Unable to reconnect to ZooKeeper service, 
session 0x24bf1b6f531006b has expired, closing socket connection 
(org.apache.zookeeper.ClientCnxn)
[2015-03-10 13:51:09,458] INFO Socket connection established to {host/ip}:2181, 
initiating session (org.apache.zookeeper.ClientCnxn)
[2015-03-10 13:51:09,730] INFO Session establishment complete on server 
{host/ip}:2181, sessionid = 0x14bf1b6f53301b7, negotiated timeout = 6000 
(org.apache.zookeeper.ClientCnxn)
---------------------------

Not sure why a ZK session timeout should trigger controller migration. 
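
One thing the log excerpt does show: the client heard nothing from the server for 16526 ms, while the timeout negotiated for the new session is 6000 ms. If the expired session had the same timeout (an assumption, since the log does not show it), the silence was long enough for the session to expire rather than just reconnect. A minimal sketch for pulling those two numbers out of a broker log follows; the function name `silence_vs_timeout` and the `sample` text are illustrative only.

```python
import re

# Patterns match the ZooKeeper client (ClientCnxn) messages quoted above.
SILENCE = re.compile(r"have not heard from server in (\d+)ms")
NEGOTIATED = re.compile(r"negotiated timeout = (\d+)")


def silence_vs_timeout(log_text):
    """Return (longest_silence_ms, last_negotiated_timeout_ms).

    Either element is None when the matching message is absent.
    """
    # Log lines are hard-wrapped in the email, so collapse whitespace first.
    flat = " ".join(log_text.split())
    silences = [int(ms) for ms in SILENCE.findall(flat)]
    timeouts = [int(ms) for ms in NEGOTIATED.findall(flat)]
    return (max(silences) if silences else None,
            timeouts[-1] if timeouts else None)


# Two of the log entries from this comment, copied as they appear above.
sample = """
[2015-03-10 13:51:08,834] INFO Client session timed out, have not heard from
server in 16526ms for sessionid 0x24bf1b6f531006b, closing socket connection
and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2015-03-10 13:51:09,730] INFO Session establishment complete on server
{host/ip}:2181, sessionid = 0x14bf1b6f53301b7, negotiated timeout = 6000
(org.apache.zookeeper.ClientCnxn)
"""

silence, timeout = silence_vs_timeout(sample)
print(silence, timeout, silence > timeout)  # 16526 6000 True
```

Running this over the full broker log would show whether each election lines up with a silence longer than the session timeout.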

> Rebalance with auto.leader.rebalance.enable=false 
> --------------------------------------------------
>
>                 Key: KAFKA-2011
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2011
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.2.0
>         Environment: 5 Hosts of below config:
> "x86_64" "32-bit, 64-bit" "Little Endian" "24 GenuineIntel CPUs Model 44 
> 1600.000MHz" "RAM 189 GB" GNU/Linux
>            Reporter: K Zakee
>            Priority: Blocker
>         Attachments: controller-logs-1.zip, controller-logs-2.zip
>
>
> Started with a clean 0.8.2 cluster of 5 brokers, with the properties set as 
> below:
> auto.leader.rebalance.enable=false
> controlled.shutdown.enable=true
> controlled.shutdown.max.retries=1
> controlled.shutdown.retry.backoff.ms=5000
> default.replication.factor=3
> log.cleaner.enable=true
> log.cleaner.threads=5
> log.cleanup.policy=delete
> log.flush.scheduler.interval.ms=3000
> log.retention.minutes=1440
> log.segment.bytes=1073741824
> message.max.bytes=1000000
> num.io.threads=14
> num.network.threads=14
> num.partitions=10
> queued.max.requests=500
> num.replica.fetchers=4
> replica.fetch.max.bytes=1048576
> replica.fetch.min.bytes=51200
> replica.lag.max.messages=5000
> replica.lag.time.max.ms=30000
> replica.fetch.wait.max.ms=1000
> fetch.purgatory.purge.interval.requests=5000
> producer.purgatory.purge.interval.requests=5000
> delete.topic.enable=true
> The logs show rebalances still happening up to 24 hours after startup.
> [2015-03-07 16:52:48,969] INFO [Controller 2]: Resuming preferred replica 
> election for partitions:  (kafka.controller.KafkaController)
> [2015-03-07 16:52:48,969] INFO [Controller 2]: Partitions that completed 
> preferred replica election:  (kafka.controller.KafkaController)
> ...
> [2015-03-07 12:07:06,783] INFO [Controller 4]: Resuming preferred replica 
> election for partitions:  (kafka.controller.KafkaController)
> ...
> [2015-03-07 09:10:41,850] INFO [Controller 3]: Resuming preferred replica 
> election for partitions:  (kafka.controller.KafkaController)
> ...
> [2015-03-07 08:26:56,396] INFO [Controller 1]: Starting preferred replica 
> leader election for partitions  (kafka.controller.KafkaController)
> ...
> [2015-03-06 16:52:59,506] INFO [Controller 2]: Partitions undergoing 
> preferred replica election:  (kafka.controller.KafkaController)
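
The controller ID in the quoted log entries differs from entry to entry, which suggests the controller role moved between brokers over those 24 hours. A rough sketch for turning such lines into a time-ordered list of controller changes follows. It assumes each "[Controller N]" entry was logged by the broker acting as controller at that time; the function name `controller_timeline` and the `sample` text are illustrative only.

```python
import re

# Timestamp (to the second) and controller ID from a KafkaController log line.
CONTROLLER = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+\] INFO \[Controller (\d+)\]"
)


def controller_timeline(log_text):
    """Return [(timestamp, controller_id), ...] with one entry per change."""
    # Drop email quote markers, then undo the hard line wrapping.
    unquoted = re.sub(r"(?m)^>\s?", "", log_text)
    flat = " ".join(unquoted.split())
    # Timestamps in this format sort correctly as plain strings.
    events = sorted(set(CONTROLLER.findall(flat)))
    timeline = []
    for timestamp, controller_id in events:
        # Record an entry only when the controller ID changes.
        if not timeline or timeline[-1][1] != controller_id:
            timeline.append((timestamp, controller_id))
    return timeline


# First lines of the log entries quoted in the issue description.
sample = """
> [2015-03-07 16:52:48,969] INFO [Controller 2]: Resuming preferred replica
> [2015-03-07 12:07:06,783] INFO [Controller 4]: Resuming preferred replica
> [2015-03-07 09:10:41,850] INFO [Controller 3]: Resuming preferred replica
> [2015-03-07 08:26:56,396] INFO [Controller 1]: Starting preferred replica
> [2015-03-06 16:52:59,506] INFO [Controller 2]: Partitions undergoing
"""

for timestamp, controller_id in controller_timeline(sample):
    print(timestamp, "controller", controller_id)
```

For the entries quoted above, this lists controllers 2, 1, 3, 4 and 2 again, in time order.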



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
