pjl1070048431 opened a new pull request, #19297:
URL: https://github.com/apache/kafka/pull/19297

   ## Motivation
   
   Kafka clusters typically require rebalancing of topic replicas after 
horizontal scaling to evenly distribute the load across new and existing 
brokers. The current rebalancing approach does not consider the existing 
replica distribution, often resulting in excessive and unnecessary replica 
movements. These unnecessary movements increase rebalance duration, consume 
significant bandwidth and CPU resources, and potentially disrupt ongoing 
production and consumption operations. Thus, a replica rebalancing strategy 
that minimizes movements while achieving an even distribution of replicas is 
necessary.
   
   ## Goals
   
   The new replica rebalancing strategy aims to achieve the following 
objectives:
   
   1. **Minimal Movement**: Minimize the number of replica relocations during 
rebalancing.
   2. **Replica Balancing**: Ensure that replicas are evenly distributed across 
brokers.
   3. **Anti-Affinity Support**: Support rack-aware allocation when enabled.
   4. **Leader Balancing**: Distribute leader replicas evenly across brokers.
   5. **ISR Order Optimization**: Optimize adjacency relationships to prevent 
failover traffic concentration in case of broker failures.
   
   ## Proposed Changes
   
   ### Rack-Level Replica Distribution
   
   The following rules ensure balanced replica allocation at the rack level:
   
   1. **When** `rackCount = replicationFactor`:
      - Each rack receives exactly `partitionCount` replicas.
   2. **When** `rackCount > replicationFactor`:
      - If a rack's weighted allocation 
`rackBrokers / totalBrokers × totalReplicas` is at least `partitionCount`, 
that rack receives exactly `partitionCount` replicas.
      - If a rack's weighted allocation is less than `partitionCount`, the 
remaining replicas are distributed among such racks using a weighted 
remainder allocation.
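
The rack-level rules can be sketched as follows. This is only an illustrative reading of the rules, not the code in this PR: the function name `rack_quotas` and its parameters are hypothetical, the cap is re-evaluated after each round of capping, and "weighted remainder allocation" is assumed to mean a largest-remainder split. Every rack is assumed to have at least one broker.

```python
def rack_quotas(rack_brokers, partition_count, replication_factor):
    """Return {rack: replica quota}. Hypothetical helper, not Kafka API.

    rack_brokers maps rack name -> number of brokers (assumed >= 1).
    """
    if len(rack_brokers) < replication_factor:
        raise ValueError("need at least replication_factor racks")
    quotas = {}
    open_racks = dict(rack_brokers)  # racks not yet capped
    remaining = partition_count * replication_factor

    # Cap every rack whose weighted share reaches partition_count, then
    # recompute the shares of the other racks until nothing else is capped.
    while open_racks:
        open_brokers = sum(open_racks.values())
        capped = [rack for rack, n in open_racks.items()
                  if n * remaining >= partition_count * open_brokers]
        if not capped:
            break
        for rack in capped:
            quotas[rack] = partition_count
            del open_racks[rack]
        remaining -= partition_count * len(capped)

    # Assumed interpretation: largest-remainder split of what is left.
    if open_racks:
        open_brokers = sum(open_racks.values())
        leftover = remaining
        for rack, n in open_racks.items():
            quotas[rack] = n * remaining // open_brokers
            leftover -= quotas[rack]
        by_remainder = sorted(
            open_racks,
            key=lambda r: (-(open_racks[r] * remaining % open_brokers), r),
        )
        for rack in by_remainder[:leftover]:
            quotas[rack] += 1
    return quotas
```

Under these assumptions, three racks with 3, 1, and 1 brokers, 10 partitions, and a replication factor of 3 each receive 10 replicas, while four single-broker racks with 5 partitions and a replication factor of 3 receive 4, 4, 4, and 3.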
   
   ### Node-Level Replica Distribution
   
   1. If the number of replicas assigned to a rack is not a multiple of the 
number of nodes in that rack, some nodes will host one additional replica 
compared to others.
   2. **When** `rackCount = replicationFactor`:
      - If all racks have an equal number of nodes, each node will host an 
equal number of replicas.
      - If rack sizes vary, nodes in larger racks will host fewer replicas on 
average.
   3. **When** `rackCount > replicationFactor`:
      - If no rack has a significantly higher node weight, replicas will be 
evenly distributed.
      - If a rack has disproportionately high node weight, those nodes will 
receive fewer replicas.
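
A minimal sketch of the node-level split, assuming a rack's quota is already known. The name `node_quotas` is hypothetical, and which nodes receive the extra replica is chosen arbitrarily here (lowest broker IDs first); the actual strategy may prefer nodes that already host replicas to reduce movement.

```python
def node_quotas(rack_quota, nodes):
    """Split a rack's replica quota across its nodes (illustrative only)."""
    base, extra = divmod(rack_quota, len(nodes))
    # The first `extra` nodes in sorted order host one additional replica.
    return {node: base + (1 if i < extra else 0)
            for i, node in enumerate(sorted(nodes))}
```

For example, with `rackCount = replicationFactor` and 12 partitions, every rack holds 12 replicas: a three-node rack hosts 4 per node, while a two-node rack hosts 6 per node, which is why nodes in larger racks host fewer replicas on average.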
   
   ### Anti-Affinity Support
   
   When anti-affinity is enabled, the rebalance algorithm ensures that replicas 
of the same partition do not colocate on the same rack. Brokers without rack 
configuration are excluded from anti-affinity checks.
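
The anti-affinity rule can be expressed as a check over a candidate assignment. The sketch below is illustrative rather than the PR's implementation; `anti_affinity_violations` and its argument shapes are assumptions, with `None` standing for a broker that has no rack configured.

```python
from collections import Counter

def anti_affinity_violations(assignment, broker_rack):
    """Return {partition: [racks holding more than one of its replicas]}.

    assignment maps partition -> list of broker IDs;
    broker_rack maps broker ID -> rack name, or None if no rack is set.
    """
    violations = {}
    for partition, replicas in assignment.items():
        racks = Counter(broker_rack.get(broker) for broker in replicas)
        racks.pop(None, None)  # brokers without a rack are excluded
        shared = sorted(rack for rack, count in racks.items() if count > 1)
        if shared:
            violations[partition] = shared
    return violations
```

An empty result means no partition has two replicas on the same rack; replicas on brokers without a rack never count as a violation.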
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
