Apologies, never mind this. I realize now that I had confused Log objects with topic partitions. Since Log size is constant, the number of logs in the log directory is actually a good criterion.
--
Igor Soarez

On Thu, May 7, 2020, at 4:41 PM, Igor Soarez wrote:
> When running Kafka with multiple log directories
> kafka.log.LogManager.getOrCreateLog selects the first available log
> directory with the smallest number of topic partitions.
> Topic partitions can have different sizes and this policy easily leads
> to data imbalances between log directories (or disks).
>
> It isn't hard to change the policy (or add a configuration option to
> change it) so that the directory picked is the one with the smallest
> total size of logs, i.e. the least used storage-wise. I have a patch and
> tests, what's the best way to go about this? Open a PR? Create a JIRA
> first? Create a KIP first?
>
> Since the existing policy makes little sense IMO, should it be changed
> straightaway or should we have an option to activate the correct
> behavior and keep the existing policy as default?
>
> --
> Igor Soarez
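
For illustration only, the two selection policies discussed in the quoted message could be sketched roughly as below. This is a toy model and not Kafka's actual code: the class LogDirPicker, the methods fewestLogs and smallestTotalSize, and the representation of a log directory as a map from path to a list of log sizes in bytes are all assumptions made up for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Kafka code: each log directory is modeled as
// a path mapped to the sizes (in bytes) of the logs it currently holds.
public class LogDirPicker {

    // Count-based policy: pick the first directory holding the fewest logs.
    // Ties go to the directory that comes first in iteration order.
    static String fewestLogs(Map<String, List<Long>> dirs) {
        String best = null;
        int bestCount = Integer.MAX_VALUE;
        for (Map.Entry<String, List<Long>> e : dirs.entrySet()) {
            int count = e.getValue().size();
            if (count < bestCount) {
                bestCount = count;
                best = e.getKey();
            }
        }
        return best; // null if there are no directories
    }

    // Size-based policy: pick the first directory with the smallest
    // total size of logs.
    static String smallestTotalSize(Map<String, List<Long>> dirs) {
        String best = null;
        long bestTotal = Long.MAX_VALUE;
        for (Map.Entry<String, List<Long>> e : dirs.entrySet()) {
            long total = 0L;
            for (long size : e.getValue()) {
                total += size;
            }
            if (total < bestTotal) {
                bestTotal = total;
                best = e.getKey();
            }
        }
        return best; // null if there are no directories
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so "first available" is well defined.
        Map<String, List<Long>> dirs = new LinkedHashMap<>();
        dirs.put("/data/d1", Arrays.asList(900L));           // one large log
        dirs.put("/data/d2", Arrays.asList(10L, 20L, 30L));  // three small logs

        System.out.println("fewest logs:         " + fewestLogs(dirs));
        System.out.println("smallest total size: " + smallestTotalSize(dirs));
    }
}
```

With one large log in the first directory and three small logs in the second, the two policies disagree, which is the imbalance the original message was worried about. If every log had the same size, both methods would always pick the same directory.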