When running Kafka with multiple log directories, 
kafka.log.LogManager.getOrCreateLog selects the first available log directory 
with the smallest number of topic partitions.
Topic partitions can have different sizes, so this policy easily leads to data 
imbalances between log directories (or disks).

It isn't hard to change the policy (or add a configuration option to change it) 
so that the directory picked is the one with the smallest total size of logs, 
i.e. the least used storage-wise. I have a patch and tests. What's the best way 
to go about this? Open a PR? Create a JIRA first? Create a KIP first?
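
To make the difference concrete, here is a minimal sketch of the two policies 
as described above. It is not the LogManager code and not the patch itself: 
LogDirPicker, its method names, the directory paths and the sizes are all 
made up for illustration, and each directory is reduced to a list of 
partition sizes in bytes.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the selection logic; not Kafka's actual code.
// Keys are log directories in configuration order, values are the sizes
// (in bytes) of the partitions each directory currently holds.
class LogDirPicker {

    // Existing policy as described above: the directory with the fewest
    // partitions wins; ties go to the first directory encountered.
    static String fewestPartitions(Map<String, List<Long>> sizesByDir) {
        String best = null;
        int bestCount = Integer.MAX_VALUE;
        for (Map.Entry<String, List<Long>> e : sizesByDir.entrySet()) {
            int count = e.getValue().size();
            if (count < bestCount) {
                best = e.getKey();
                bestCount = count;
            }
        }
        return best; // null if no directories were given
    }

    // Proposed policy: the directory with the smallest total log size wins;
    // ties go to the first directory encountered.
    static String smallestTotalSize(Map<String, List<Long>> sizesByDir) {
        String best = null;
        long bestBytes = Long.MAX_VALUE;
        for (Map.Entry<String, List<Long>> e : sizesByDir.entrySet()) {
            long bytes = 0;
            for (long size : e.getValue()) {
                bytes += size;
            }
            if (bytes < bestBytes) {
                best = e.getKey();
                bestBytes = bytes;
            }
        }
        return best; // null if no directories were given
    }

    public static void main(String[] args) {
        Map<String, List<Long>> dirs = new LinkedHashMap<>();
        dirs.put("/data/d1", Arrays.asList(900L));           // one large partition
        dirs.put("/data/d2", Arrays.asList(10L, 20L, 30L));  // three small partitions

        System.out.println(fewestPartitions(dirs));   // prints /data/d1
        System.out.println(smallestTotalSize(dirs));  // prints /data/d2
    }
}
```

With one large partition in d1 and three small ones in d2, the count-based 
policy keeps adding to the fuller directory, while the size-based policy 
picks the emptier one.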

Since the existing policy makes little sense IMO, should it be changed 
straightaway, or should we add an option to activate the correct behavior and 
keep the existing policy as the default?

--
Igor Soarez
