I have written such a script. It balances the cluster by data size on disk. It relies heavily on internal tools, which is why it's not open-sourced. I plan to rewrite it without the internal tooling.
In terms of leader balancing, when using the partition reassignment script, whichever broker is specified first within a partition's list of brokers will be the leader.

-Clark

Clark Elliott Haskins III
LinkedIn DDS Site Reliability Engineer
Kafka, Zookeeper, Samza SRE
Mobile: 505.385.1484
BlueJeans: https://www.bluejeans.com/chaskins
chask...@linkedin.com
https://www.linkedin.com/in/clarkhaskins
There is no place like 127.0.0.1

On 7/10/14, 6:07 PM, "Florian Dambrine" <flor...@gumgum.com> wrote:

>Thanks for your answer,
>
>Indeed, I have already worked on this kind of script. I ended up with 800
>lines of Groovy script that rebalances partitions across the cluster while
>minimizing the number of partition moves. I also worked on the partition
>leadership balancing.
>
>I still have to work on my script because I end up with some unbalanced
>leaders. The number of partitions led by one node is on average 17, but I
>have one node that ends up with 12 leads and another with 21.
>I am gonna introduce swaps to re-equilibrate the leadership.
>
>Have you ever worked on this kind of script? I could not find any
>open-source code on GitHub...
>
>Do you have any suggestions?
>
>Just in case you want to have a look, I have published my code on
>GitHub (
>https://github.com/Lowess/Kafka/blob/master/com/gumgum/kafka/KafkaManualPartitionRebalancer.groovy
>)
>
>Thanks
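
To illustrate the leader-ordering point from the reply above, here is a minimal sketch (not either author's script) of how replica order could be used to even out leadership. It assumes the first broker in each replica list is the one that ends up leading the partition; depending on the Kafka version and settings, leadership may only move there once a preferred replica election runs. The function names `balance_preferred_leaders` and `to_reassignment_json`, the topic name, and the sample broker ids are all hypothetical. The output follows the JSON layout accepted by the reassignment tool's `--reassignment-json-file` option.

```python
import json
from collections import Counter


def balance_preferred_leaders(assignment):
    """Reorder each replica list so first-listed brokers are spread evenly.

    assignment: dict mapping (topic, partition) -> list of broker ids.
    Returns a new dict with the same replica sets, possibly reordered.
    No data moves between brokers; only the order within each list changes.
    """
    leader_count = Counter()
    balanced = {}
    # Handle partitions with the fewest replicas first: they have the least choice.
    for tp in sorted(assignment, key=lambda tp: (len(assignment[tp]), tp)):
        replicas = assignment[tp]
        # Pick the replica that is listed first for the fewest partitions so far
        # (ties broken by broker id so the result is deterministic).
        leader = min(replicas, key=lambda b: (leader_count[b], b))
        leader_count[leader] += 1
        balanced[tp] = [leader] + [b for b in replicas if b != leader]
    return balanced


def to_reassignment_json(assignment):
    """Render an assignment in the reassignment tool's JSON layout."""
    partitions = [
        {"topic": topic, "partition": partition, "replicas": replicas}
        for (topic, partition), replicas in sorted(assignment.items())
    ]
    return json.dumps({"version": 1, "partitions": partitions}, indent=2)


if __name__ == "__main__":
    # Hypothetical cluster: broker 1 is listed first for three of four partitions.
    current = {
        ("events", 0): [1, 2],
        ("events", 1): [1, 3],
        ("events", 2): [1, 2],
        ("events", 3): [2, 3],
    }
    print(to_reassignment_json(balance_preferred_leaders(current)))
```

This greedy pass only reorders replicas, so it cannot fix an imbalance caused by where replicas live; swapping replicas between brokers, as mentioned in the quoted message, would be a separate step.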