Take a look at bin/kafka-reassign-partitions.sh

Option                                  Description
------                                  -----------
--broker-list <brokerlist>              The list of brokers to which the
                                          partitions need to be reassigned in
                                          the form "0,1,2". This is required
                                          if --topics-to-move-json-file is
                                          used to generate reassignment
                                          configuration
--execute                               Kick off the reassignment as specified
                                          by the --reassignment-json-file
                                          option.
--generate                              Generate a candidate partition
                                          reassignment configuration. Note
                                          that this only generates a candidate
                                          assignment, it does not execute it.
--reassignment-json-file <manual        The JSON file with the partition
  assignment json file path>              reassignment configuration. The
                                          format to use is -
                                        {"partitions":
                                          [{"topic": "foo",
                                            "partition": 1,
                                            "replicas": [1,2,3] }],
                                        "version":1
                                        }
--topics-to-move-json-file <topics to   Generate a reassignment configuration
  reassign json file path>                to move the partitions of the
                                          specified topics to the list of
                                          brokers specified by the
                                          --broker-list option. The format to
                                          use is -
                                        {"topics":
                                          [{"topic": "foo"},{"topic": "foo1"}],
                                        "version":1
                                        }
--verify                                Verify if the reassignment completed
                                          as specified by the
                                          --reassignment-json-file option.
--zookeeper <urls>                      REQUIRED: The connection string for
                                          the zookeeper connection in the form
                                          host:port. Multiple URLS can be
                                          given to allow fail-over.

Command must include exactly one action: --generate, --execute or --verify

/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
********************************************/

On Tue, Jun 24, 2014 at 7:44 PM, Virendra Pratap Singh <vpsi...@yahoo-inc.com.invalid> wrote:

> Have a kafka cluster with 10 brokers (kafka 0.8.0). All of the brokers
> were set up upfront. None was added later. Default number of partitions is
> set to 4 and default replication to 2.
> Have 3 topics in the system. None of these topics are manually created
> upfront, when the cluster is set up.
> So we are relying on kafka to automatically create these topics when the
> producer(s) send data for the first time for each of these topics.
> We have multiple producers which will emit data for all of these topics at
> any point of time. What it means is that kafka will be hit with produce
> requests for these 3 topics from multiple producers simultaneously.
>
> What is observed is that the topic partitions do not get spread out evenly
> in this scenario. There are 10 brokers (ids 1-10), so the expectation is
> that 3 * 4 = 12 topic partitions should be spread out on all 10 servers.
> However, in this case the first 2 brokers share most of the load and only
> a few partitions are spread out. The same is true for the replicated
> instances also.
>
> Here is the dump of list topic
>
> topic: topic1  partition: 0  leader: 1   replicas: 1,2   isr: 1,2
> topic: topic1  partition: 1  leader: 2   replicas: 2,1   isr: 2,1
> topic: topic1  partition: 2  leader: 1   replicas: 1,2   isr: 1,2
> topic: topic1  partition: 3  leader: 2   replicas: 2,1   isr: 2,1
> topic: topic2  partition: 0  leader: 9   replicas: 9,4   isr: 9,4
> topic: topic2  partition: 1  leader: 10  replicas: 10,5  isr: 10,5
> topic: topic2  partition: 2  leader: 1   replicas: 1,6   isr: 1,6
> topic: topic2  partition: 3  leader: 2   replicas: 2,7   isr: 2,7
> topic: topic3  partition: 0  leader: 2   replicas: 2,1   isr: 2,1
> topic: topic3  partition: 1  leader: 1   replicas: 1,2   isr: 1,2
> topic: topic3  partition: 2  leader: 2   replicas: 2,1   isr: 2,1
> topic: topic3  partition: 3  leader: 1   replicas: 1,2   isr: 1,2
>
> So what are my options to have kafka evenly distribute the topic
> partitions? Would pre-creating the topics via the create topic command help?
>
> Regards,
> Virendra
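
For the 12 partitions in the quoted listing, a manual reassignment file in the --reassignment-json-file format from the help text could be produced with a short script. This is only a rough sketch: the function name round_robin_assignment is made up for illustration, and the plain round-robin placement is an assumption about what "evenly" should mean here, not a description of how Kafka itself places partitions.

```python
import json


def round_robin_assignment(topics, partitions_per_topic, broker_ids, replication_factor):
    """Build a reassignment dict in the {"partitions": [...], "version": 1} format.

    Hypothetical helper: replicas are placed round-robin over broker_ids,
    and the position keeps advancing across topics so that later topics
    do not start again on the first broker.
    """
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor exceeds number of brokers")

    entries = []
    slot = 0  # running position in broker_ids, shared by all topics
    for topic in topics:
        for partition in range(partitions_per_topic):
            # Consecutive brokers starting at the current slot hold the replicas.
            replicas = [
                broker_ids[(slot + offset) % len(broker_ids)]
                for offset in range(replication_factor)
            ]
            entries.append(
                {"topic": topic, "partition": partition, "replicas": replicas}
            )
            slot += 1
    return {"partitions": entries, "version": 1}


# Values taken from the quoted message: 3 topics, 4 partitions each,
# brokers with ids 1-10, replication factor 2.
plan = round_robin_assignment(
    ["topic1", "topic2", "topic3"], 4, list(range(1, 11)), 2
)
print(json.dumps(plan, indent=2))
```

With these inputs every broker ends up with between 2 and 4 replicas instead of most of them sitting on brokers 1 and 2. The printed JSON can be saved to a file and passed via --reassignment-json-file together with --zookeeper and --execute, then checked with --verify, assuming the tool shipped with your Kafka version accepts the options shown in the help text above.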