Hi All,
While trying to figure out why my brokers have some disk imbalance, I found
that Kafka is not spreading a topic's replicas across all available brokers
(maybe this is the way it is supposed to work?).

I have been trying to figure out how a topic with 5 partitions and
replication_factor=3 (15 replicas) could end up with its replicas spread
over only 9 brokers instead of 15, especially when the cluster has more
brokers than that topic has replicas.

The cluster has 48 brokers.

# topics.py describe -topic topic1
{145: 1, 148: 2, *101: 3*, 146: 1, 102: 2, 147: 1, 103: 2, 104: 2, 105: 1}
The keys are the broker IDs and the values are how many replicas of the
topic each broker holds.

As you can see, broker 101 has 3 replicas, which makes its disk usage
unbalanced compared to the other brokers.
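
For readers who want to reproduce this kind of per-broker count, here is a
minimal sketch. topics.py itself is not shown in this post, so the function
name replicas_per_broker and the sample assignment below are assumptions
made purely for illustration; they are not the real layout of topic1.

```python
from collections import Counter


def replicas_per_broker(assignment):
    """Count how many replicas of a topic each broker holds.

    assignment maps partition id -> list of broker ids holding a replica.
    (Hypothetical input shape; the real topics.py may obtain it differently.)
    """
    counts = Counter()
    for replicas in assignment.values():
        counts.update(replicas)
    return dict(counts)


# Illustrative assignment only: 3 partitions, replication factor 3.
example = {
    0: [101, 102, 103],
    1: [102, 103, 104],
    2: [103, 104, 105],
}

counts = replicas_per_broker(example)
print(counts)
print(len(counts), "distinct brokers for", sum(counts.values()), "replicas")
```

In this made-up example the 9 replicas land on only 5 distinct brokers,
which is the same kind of overlap described above.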

I created a brand new topic in a test cluster with 24 brokers. The topic has
5 partitions with replication factor 3:
# topics.py describe -topic test
{119: 1, 103: 1, 106: 2, 109: 1, 101: 2, 114: 1, 116: 2, 118: 1, 111: 2,
104: 1, 121: 1}

This time Kafka spread the replicas over 11 brokers instead of 15.
Just for fun, I ran a partition reassignment for topic test, placing each
replica on a different broker. Result:

# topics.py describe -topic test
{110: 1, 111: 1, 109: 1, 108: 1, 112: 1, 103: 1, 107: 1, 105: 1, 104: 1,
106: 1, 102: 1, 118: 1, 116: 1, 113: 1, 117: 1}

Now all replicas are spread across 15 brokers.
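
A rough sketch of how such a reassignment plan could be generated is below.
The helper name spread_plan and the broker id range 101..124 are assumptions
for illustration (the post does not list the test cluster's broker ids), and
this is not the exact plan used above. The output follows the JSON layout
that kafka-reassign-partitions.sh accepts via --reassignment-json-file.

```python
import json


def spread_plan(topic, num_partitions, replication_factor, broker_ids):
    """Build a reassignment plan that puts every replica on its own broker.

    Hypothetical helper: it takes the lowest broker ids in sorted order,
    which is only one of many possible placements.
    """
    needed = num_partitions * replication_factor
    brokers = sorted(broker_ids)
    if len(brokers) < needed:
        raise ValueError(
            f"need {needed} brokers for one replica each, have {len(brokers)}"
        )
    partitions = []
    for p in range(num_partitions):
        start = p * replication_factor
        partitions.append({
            "topic": topic,
            "partition": p,
            # The first broker in the list is the preferred leader.
            "replicas": brokers[start:start + replication_factor],
        })
    return {"version": 1, "partitions": partitions}


# Assumed broker ids for a 24-broker test cluster.
plan = spread_plan("test", 5, 3, range(101, 125))
print(json.dumps(plan, indent=2))
```

This only works when the cluster has at least partitions x replication
factor brokers (15 here); otherwise some broker has to hold more than one
replica of the topic.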

Is there something I am missing? Maybe the reason is to keep network
chatter down? By the way, I don't have any rack awareness configured.
Thanks!
