Hi Joe,

   Thanks for the info. I am aware of the partition reassignment tool. I was
trying to understand why the distribution was uneven in the first place.
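
To put numbers on the skew, here is a minimal Python sketch that tallies replicas and leaders per broker. The assignments are copied from the topic listing quoted below; the names `ASSIGNMENTS` and `tally` are only illustrative and not part of any Kafka tooling.

```python
from collections import Counter

# (topic, partition, leader, replicas), copied from the listing quoted below.
ASSIGNMENTS = [
    ("topic1", 0, 1, [1, 2]),
    ("topic1", 1, 2, [2, 1]),
    ("topic1", 2, 1, [1, 2]),
    ("topic1", 3, 2, [2, 1]),
    ("topic2", 0, 9, [9, 4]),
    ("topic2", 1, 10, [10, 5]),
    ("topic2", 2, 1, [1, 6]),
    ("topic2", 3, 2, [2, 7]),
    ("topic3", 0, 2, [2, 1]),
    ("topic3", 1, 1, [1, 2]),
    ("topic3", 2, 2, [2, 1]),
    ("topic3", 3, 1, [1, 2]),
]


def tally(assignments):
    """Return (replicas per broker, leaders per broker) as Counters."""
    replicas = Counter()
    leaders = Counter()
    for _topic, _partition, leader, replica_ids in assignments:
        leaders[leader] += 1
        replicas.update(replica_ids)
    return replicas, leaders


if __name__ == "__main__":
    replicas, leaders = tally(ASSIGNMENTS)
    for broker in range(1, 11):
        print(f"broker {broker:2d}: replicas={replicas[broker]} leaders={leaders[broker]}")
```

With this listing, brokers 1 and 2 each hold 9 of the 24 replicas, while brokers 3 and 8 hold none.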

Regards,
Virendra

On 6/24/14, 8:41 PM, "Joe Stein" <joe.st...@stealth.ly> wrote:

>Take a look at
>
>bin/kafka-reassign-partitions.sh
>
>Option                                  Description
>------                                  -----------
>--broker-list <brokerlist>              The list of brokers to which the
>                                          partitions need to be reassigned in
>                                          the form "0,1,2". This is required
>                                          if --topics-to-move-json-file is
>                                          used to generate reassignment
>                                          configuration
>--execute                               Kick off the reassignment as specified
>                                          by the --reassignment-json-file
>                                          option.
>--generate                              Generate a candidate partition
>                                          reassignment configuration. Note
>                                          that this only generates a candidate
>                                          assignment, it does not execute it.
>--reassignment-json-file <manual        The JSON file with the partition
>  assignment json file path>              reassignment configuration. The
>                                          format to use is -
>                                        {"partitions":
>                                        [{"topic": "foo",
>                                          "partition": 1,
>                                          "replicas": [1,2,3] }],
>                                        "version":1
>                                        }
>--topics-to-move-json-file <topics to   Generate a reassignment configuration
>  reassign json file path>                to move the partitions of the
>                                          specified topics to the list of
>                                          brokers specified by the
>                                          --broker-list option. The format to
>                                          use is -
>                                        {"topics":
>                                        [{"topic": "foo"},{"topic": "foo1"}],
>                                        "version":1
>                                        }
>--verify                                Verify if the reassignment completed
>                                          as specified by the
>                                          --reassignment-json-file option.
>--zookeeper <urls>                      REQUIRED: The connection string for
>                                          the zookeeper connection in the form
>                                          host:port. Multiple URLS can be
>                                          given to allow fail-over.
>
>Command must include exactly one action: --generate, --execute or --verify
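
As a rough illustration of the --reassignment-json-file format shown in the help text above, the following Python sketch builds a manual assignment that spreads replicas round-robin across a broker list. The round-robin placement, the function name `build_reassignment`, and the topic names and broker ids passed to it are assumptions for illustration; this is not the algorithm Kafka itself uses.

```python
import json


def build_reassignment(topics, partitions_per_topic, brokers, replication_factor):
    """Build a reassignment dict, placing replicas round-robin over brokers.

    Illustrative only: this is a simple round-robin, not Kafka's own
    replica placement logic.
    """
    if replication_factor > len(brokers):
        raise ValueError("replication factor exceeds number of brokers")
    entries = []
    slot = 0  # running position in the broker list, shared across topics
    for topic in topics:
        for partition in range(partitions_per_topic):
            replicas = [
                brokers[(slot + i) % len(brokers)]
                for i in range(replication_factor)
            ]
            entries.append(
                {"topic": topic, "partition": partition, "replicas": replicas}
            )
            slot += 1
    return {"partitions": entries, "version": 1}


if __name__ == "__main__":
    plan = build_reassignment(
        ["topic1", "topic2", "topic3"], 4, list(range(1, 11)), 2
    )
    print(json.dumps(plan, indent=2))
```

The printed JSON could be saved to a file and passed to the tool via --reassignment-json-file together with --zookeeper and --execute, then checked with --verify.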
>
>/*******************************************
> Joe Stein
> Founder, Principal Consultant
> Big Data Open Source Security LLC
> http://www.stealth.ly
> Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
>********************************************/
>
>
>On Tue, Jun 24, 2014 at 7:44 PM, Virendra Pratap Singh <
>vpsi...@yahoo-inc.com.invalid> wrote:
>
>> We have a Kafka cluster with 10 brokers (Kafka 0.8.0). All of the brokers
>> were set up upfront; none was added later. The default number of
>> partitions is set to 4 and the default replication factor to 2.
>> There are 3 topics in the system. None of them was created manually when
>> the cluster was set up, so we rely on Kafka to create each topic
>> automatically the first time a producer sends data for it.
>> We have multiple producers that may emit data for all of these topics at
>> any point in time. This means Kafka will be hit with simultaneous produce
>> requests from multiple producers for these 3 topics.
>>
>> What we observe is that the topic partitions do not get spread out evenly
>> in this scenario. There are 10 brokers (ids 1-10), so the expectation is
>> that the 3 * 4 = 12 topic partitions would be spread across all 10
>> brokers. However, in this case the first 2 brokers carry most of the load
>> and only a few partitions are spread out. The same is true for the
>> replicas.
>>
>> Here is the list topic output:
>>
>> topic: topic1  partition: 0  leader: 1   replicas: 1,2   isr: 1,2
>> topic: topic1  partition: 1  leader: 2   replicas: 2,1   isr: 2,1
>> topic: topic1  partition: 2  leader: 1   replicas: 1,2   isr: 1,2
>> topic: topic1  partition: 3  leader: 2   replicas: 2,1   isr: 2,1
>> topic: topic2  partition: 0  leader: 9   replicas: 9,4   isr: 9,4
>> topic: topic2  partition: 1  leader: 10  replicas: 10,5  isr: 10,5
>> topic: topic2  partition: 2  leader: 1   replicas: 1,6   isr: 1,6
>> topic: topic2  partition: 3  leader: 2   replicas: 2,7   isr: 2,7
>> topic: topic3  partition: 0  leader: 2   replicas: 2,1   isr: 2,1
>> topic: topic3  partition: 1  leader: 1   replicas: 1,2   isr: 1,2
>> topic: topic3  partition: 2  leader: 2   replicas: 2,1   isr: 2,1
>> topic: topic3  partition: 3  leader: 1   replicas: 1,2   isr: 1,2
>>
>> So what are my options for getting Kafka to distribute the topic
>> partitions evenly? Would pre-creating the topics via the create topic
>> command help?
>>
>> Regards,
>> Virendra
>>
