Hello,

We have a Kafka cluster (2.4.1) with a replication factor of 3. I noticed that when we stop a broker, a single broker takes over all the load from the missing broker and becomes the leader for all of its partitions. I would have thought that Kafka would split the load evenly among the remaining brokers.
So if I have this kind of configuration:

Topic: test
Partition 0 - Leader: 1 - Replicas: 1,2,3 - Isr: 1,2,3
Partition 1 - Leader: 2 - Replicas: 2,3,1 - Isr: 1,2,3
Partition 2 - Leader: 3 - Replicas: 3,1,2 - Isr: 1,2,3
Partition 3 - Leader: 1 - Replicas: 1,2,3 - Isr: 1,2,3
Partition 4 - Leader: 2 - Replicas: 2,3,1 - Isr: 1,2,3
Partition 5 - Leader: 3 - Replicas: 3,1,2 - Isr: 1,2,3

and I stop broker 1, I would expect something like this, with the load split evenly between brokers 2 and 3:

Topic: test
Partition 0 - Leader: 2 - Replicas: 1,2,3 - Isr: 2,3
Partition 1 - Leader: 2 - Replicas: 2,3,1 - Isr: 2,3
Partition 2 - Leader: 3 - Replicas: 3,1,2 - Isr: 2,3
Partition 3 - Leader: 3 - Replicas: 1,2,3 - Isr: 2,3
Partition 4 - Leader: 2 - Replicas: 2,3,1 - Isr: 2,3
Partition 5 - Leader: 3 - Replicas: 3,1,2 - Isr: 2,3

What I actually observe is this, where broker 2 takes over all of broker 1's load:

Topic: test
Partition 0 - Leader: 2 - Replicas: 1,2,3 - Isr: 2,3
Partition 1 - Leader: 2 - Replicas: 2,3,1 - Isr: 2,3
Partition 2 - Leader: 3 - Replicas: 3,1,2 - Isr: 2,3
Partition 3 - Leader: 2 - Replicas: 1,2,3 - Isr: 2,3
Partition 4 - Leader: 2 - Replicas: 2,3,1 - Isr: 2,3
Partition 5 - Leader: 3 - Replicas: 3,1,2 - Isr: 2,3

My concern is that this means no broker can ever exceed 50% of its network bandwidth in normal operation, so that it can absorb a failed broker's entire load, and that could be a problem in my case.

Is there a way to change this behavior (manually by forcing a leader, programmatically, or by configuration)?

From my understanding, the kafka-leader-election.sh script only allows electing the preferred leader (the first replica in the list) or an unclean leader (a replica that is not in sync becomes eligible).
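For reference, this is the kind of invocation I am referring to (the broker address is a placeholder for one of ours):

  bin/kafka-leader-election.sh --bootstrap-server localhost:9092 \
    --election-type preferred \
    --topic test --partition 3

Since the preferred replica of partitions 0 and 3 is broker 1, which is exactly the broker that is down, a preferred election cannot move those leaders to broker 3.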
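I also looked at doing this programmatically. The closest I found is the AdminClient electLeaders call, which appears to support the same two election types and therefore has the same limitation. A minimal sketch, assuming a broker reachable at localhost:9092 (class name and address are placeholders of mine):

  import org.apache.kafka.clients.admin.AdminClient;
  import org.apache.kafka.common.ElectionType;
  import org.apache.kafka.common.TopicPartition;

  import java.util.Collections;
  import java.util.Map;
  import java.util.Optional;
  import java.util.Properties;

  public class PreferredElection {
      public static void main(String[] args) throws Exception {
          Properties props = new Properties();
          props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address

          try (AdminClient admin = AdminClient.create(props)) {
              // Ask the controller to elect the preferred leader of test-3.
              Map<TopicPartition, Optional<Throwable>> results = admin
                  .electLeaders(ElectionType.PREFERRED,
                                Collections.singleton(new TopicPartition("test", 3)))
                  .partitions()
                  .get();

              // An empty Optional means the election succeeded for that partition.
              results.forEach((tp, error) ->
                  System.out.println(tp + " -> " + error.map(Throwable::toString).orElse("ok")));
          }
      }
  }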
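The only manual workaround I can think of is changing the replica order with kafka-reassign-partitions.sh so that the preferred leader of some of broker 1's partitions becomes broker 3, for example with a JSON file like this (file name and ZooKeeper address are placeholders):

  {
    "version": 1,
    "partitions": [
      {"topic": "test", "partition": 3, "replicas": [3, 2, 1]}
    ]
  }

  bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
    --reassignment-json-file reassign.json --execute

But I suspect such a reassignment cannot complete while broker 1 is offline, since broker 1 will never rejoin the ISR, so this does not look like a real solution either.

Regards,
Pierre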