Maybe someone already answered this…but you can use the preferred replica election tool to fix that (it’s included with Kafka).
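
For example, something along these lines should hand leadership back to the preferred replicas (the ZooKeeper address is just a placeholder; point it at your own ensemble):

    bin/kafka-preferred-replica-election.sh --zookeeper localhost:2181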

As for the root cause, you probably had a few leader elections due to excessive latency. There is a cascading scenario that I’ve noticed Kafka is vulnerable to. The events transpire as follows:

1.) Network latency occurs due to random acts of God.
2.) This triggers leader election for some partitions.
3.) Because the new leaders now have more work to do, this may introduce more latency, which triggers more leader elections…wash, rinse, and repeat.
4.) Now we have one ring to rule them all…one ring to find them, one ring to bring them all and in the darkness bind them: a single broker ends up as leader for everything.

I’m not sure how much a new leader election would trigger disk reads (possibly because the new leader may not have the data it now serves in the OS page cache)…but that could be another source of latency (perhaps a more learned voice can chime in here).

In any case, you should probably have some kind of alerting on leadership 
distribution.  And hedge against this scenario with beefy machines. 
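
If it helps, a quick-and-dirty way to watch the leader distribution is to count leaders per broker id straight from the describe output (the ZooKeeper address is again a placeholder for your cluster):

    bin/kafka-topics.sh --describe --zookeeper localhost:2181 | grep -o 'Leader: [0-9]*' | sort | uniq -c

It’s also worth checking the auto.leader.rebalance.enable broker setting; with it on, the controller should eventually move leadership back to the preferred replicas on its own.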

-David

On 10/7/16, 2:49 AM, "Misra, Rahul" <rahul.mi...@altisource.com> wrote:

    Hi,
    
    I have been using a 3-node Kafka cluster for development for some time now and have created some topics on it. The Kafka version I'm using is 0.9.0.1 (kafka_2.11-0.9.0.1). Yesterday I observed the following when I ran 'describe' on the topics:
    
    Topic:topicIc  PartitionCount:3        ReplicationFactor:3     Configs:
            Topic: topicIc Partition: 0    Leader: 0       Replicas: 1,2,0 Isr: 0,2,1
            Topic: topicIc Partition: 1    Leader: 0       Replicas: 2,0,1 Isr: 0,2,1
            Topic: topicIc Partition: 2    Leader: 0       Replicas: 0,1,2 Isr: 2,0,1
    Topic:topicR   PartitionCount:3        ReplicationFactor:3     Configs:
            Topic: topicR  Partition: 0    Leader: 0       Replicas: 0,1,2 Isr: 2,0,1
            Topic: topicR  Partition: 1    Leader: 0       Replicas: 1,2,0 Isr: 0,2,1
            Topic: topicR  Partition: 2    Leader: 0       Replicas: 2,0,1 Isr: 0,2,1
    Topic:topicSubE        PartitionCount:1        ReplicationFactor:3     Configs:
            Topic: topicSubE       Partition: 0    Leader: 0       Replicas: 0,2,1 Isr: 0,2,1
    Topic:topicSubIc       PartitionCount:1        ReplicationFactor:3     Configs:
            Topic: topicSubIc      Partition: 0    Leader: 0       Replicas: 2,0,1 Isr: 0,2,1
    Topic:topicSubLr       PartitionCount:1        ReplicationFactor:3     Configs:
            Topic: topicSubLr      Partition: 0    Leader: 0       Replicas: 0,1,2 Isr: 2,0,1
    Topic:topicSubR        PartitionCount:1        ReplicationFactor:3     Configs:
            Topic: topicSubR       Partition: 0    Leader: 0       Replicas: 0,1,2 Isr: 2,0,1
    
    
    As you can see, the leader for every partition of every topic is a single node: broker 0. This is true even for the '__consumer_offsets' topic.
    The brokers are up and running on all 3 nodes, and the ZooKeeper nodes are also running fine.
    
    Does anybody have any idea why this may have happened?
    Is there a way to manually rebalance partition leadership across the nodes?
    
    Regards,
    Rahul Misra
