[ https://issues.apache.org/jira/browse/KAFKA-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yonghui Yang updated KAFKA-2796:
--------------------------------
    Affects Version/s:     (was: 0.9.0.0)
          Description:
Currently, when a log is created, its directory is chosen by counting the partitions in each data directory and picking the directory with the fewest partitions.
However, TopicPartitions differ greatly in size, so usage varies widely between logDirs. Since each logDir usually corresponds to a disk, usage across disks becomes very unbalanced.
A possible solution is to reassign partitions from high-usage logDirs to low-usage logDirs. I changed the format of /admin/reassign_partitions by adding a replicaDirs field. During partition reassignment, when the broker’s LogManager.createLog() is invoked, the specified logDir is chosen if a replicaDir is specified; otherwise the logDir with the fewest partitions is chosen.

The old /admin/reassign_partitions:
{"version":1,
 "partitions":
   [
     {
       "topic" : "Foo",
       "partition": 1,
       "replicas": [1, 2, 3]
     }
   ]
}

The new /admin/reassign_partitions:
{"version":1,
 "partitions":
   [
     {
       "topic" : "Foo",
       "partition": 1,
       "replicas": [1, 2, 3],
       "replicaDirs": {"1":"/data1/kafka_data", "3":"/data10/kakfa_data" }
     }
   ]
}

This feature has been developed.
PR: https://github.com/apache/kafka/pull/484

  was:
Currently, when a log is created, its directory is chosen by counting the partitions in each data directory and picking the directory with the fewest partitions.
However, TopicPartitions differ greatly in size, so usage varies widely between logDirs. Since each logDir usually corresponds to a disk, usage across disks becomes very unbalanced.
A possible solution is to reassign partitions from high-usage logDirs to low-usage logDirs. I changed the format of /admin/reassign_partitions by adding a replicaDirs field.
During partition reassignment, when the broker’s LogManager.createLog() is invoked, the specified logDir is chosen if a replicaDir is specified; otherwise the logDir with the fewest partitions is chosen.

The old /admin/reassign_partitions:
{"version":1,
 "partitions":
   [
     {
       "topic" : "Foo",
       "partition": 1,
       "replicas": [1, 2, 3]
     }
   ]
}

The new /admin/reassign_partitions:
{"version":1,
 "partitions":
   [
     {
       "topic" : "Foo",
       "partition": 1,
       "replicas": [1, 2, 3],
       "replicaDirs": {"1":"/data1/kafka_data", "3":"/data10/kakfa_data" }
     }
   ]
}


> add support for reassignment partition to specified logdir
> ----------------------------------------------------------
>
>                 Key: KAFKA-2796
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2796
>             Project: Kafka
>          Issue Type: Improvement
>          Components: clients, controller, core, log
>            Reporter: Yonghui Yang
>            Assignee: Neha Narkhede
>              Labels: features
>             Fix For: 0.9.0.0
>
>
> Currently, when a log is created, its directory is chosen by counting the
> partitions in each data directory and picking the directory with the fewest
> partitions.
> However, TopicPartitions differ greatly in size, so usage varies widely
> between logDirs. Since each logDir usually corresponds to a disk, usage
> across disks becomes very unbalanced.
> A possible solution is to reassign partitions from high-usage logDirs to
> low-usage logDirs. I changed the format of /admin/reassign_partitions by
> adding a replicaDirs field. During partition reassignment, when the broker’s
> LogManager.createLog() is invoked, the specified logDir is chosen if a
> replicaDir is specified; otherwise the logDir with the fewest partitions is
> chosen.
> The old /admin/reassign_partitions:
> {"version":1,
>  "partitions":
>    [
>      {
>        "topic" : "Foo",
>        "partition": 1,
>        "replicas": [1, 2, 3]
>      }
>    ]
> }
> The new /admin/reassign_partitions:
> {"version":1,
>  "partitions":
>    [
>      {
>        "topic" : "Foo",
>        "partition": 1,
>        "replicas": [1, 2, 3],
>        "replicaDirs": {"1":"/data1/kafka_data", "3":"/data10/kakfa_data" }
>      }
>    ]
> }
> This feature has been developed.
> PR: https://github.com/apache/kafka/pull/484



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
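
The directory-selection rule described in the issue (use the requested logDir when one is given, otherwise fall back to the logDir with the fewest partitions) can be sketched roughly as below. This is only an illustration, not the code from the PR: the class LogDirChooser, the method chooseLogDir, and the partition counts are all hypothetical, and rejecting a requested directory that is not configured is an assumption rather than something the description states.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical illustration only; this is not Kafka's LogManager.
final class LogDirChooser {

    /**
     * Picks a log directory for a new partition replica.
     *
     * @param partitionCounts number of partitions currently in each configured logDir
     * @param requestedDir    directory named for this broker in "replicaDirs", if any
     */
    static String chooseLogDir(Map<String, Integer> partitionCounts,
                               Optional<String> requestedDir) {
        if (partitionCounts.isEmpty()) {
            throw new IllegalArgumentException("no log directories configured");
        }
        if (requestedDir.isPresent()) {
            String dir = requestedDir.get();
            // Assumption: a requested directory must be one of the configured logDirs.
            if (!partitionCounts.containsKey(dir)) {
                throw new IllegalArgumentException("unknown log directory: " + dir);
            }
            return dir;
        }
        // Fallback: fewest partitions wins. Iterating a TreeMap makes ties
        // resolve deterministically (lexicographically smallest path).
        String best = null;
        int bestCount = Integer.MAX_VALUE;
        for (Map.Entry<String, Integer> e : new TreeMap<>(partitionCounts).entrySet()) {
            if (best == null || e.getValue() < bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up partition counts for two data directories.
        Map<String, Integer> counts = Map.of("/data1/kafka_data", 12, "/data2/kafka_data", 3);
        System.out.println(chooseLogDir(counts, Optional.empty()));
        System.out.println(chooseLogDir(counts, Optional.of("/data1/kafka_data")));
    }
}
```

Note that the fallback still counts partitions rather than bytes, which is exactly why the issue proposes letting the operator name the target directory explicitly.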