Status

*Current state*: *Under Discussion*

*Author*: Jerry Cai <https://cwiki.apache.org/confluence/display/~jerrycai>

*Release: *

*Discussion thread*:

*JIRA*: KAFKA-17755
<https://issues.apache.org/jira/browse/KAFKA-17755> -
AbstractPartitionAssignor can not enable RackAwareAssignment base on lead
rack mode (Reopened)

Motivation

The current design of Kafka's rack-aware partition assignor has two
significant flaws:

   1. *Dependency on Broker-Side Configuration*:
   The replica.selector.class setting on the broker must be configured to
   RackAwareReplicaSelector. This violates the principle that partition
   assignors should be customizable independently by the client.
   2. *Violation of Kafka's Read-Write Consistency*:
   The existing approach disrupts Kafka's fundamental read-write
   consistency model, resulting in load imbalance and potential downstream
   inefficiencies.

Both issues call for an assignor that can be selected on the client alone
while still keeping load balanced across the cluster.

Public Interfaces

partition.assignment.strategy=org.apache.kafka.clients.consumer.LeaderRackAwareCooperativeStickyAssignor

or

partition.assignment.strategy=org.apache.kafka.clients.consumer.LeaderRackAwareRangeAssignor
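
As an illustration of how a consumer might opt in, the sketch below builds
the consumer properties with plain java.util.Properties so that it runs
without the Kafka client on the classpath. The assignor class name is the one
proposed above and does not exist in any released Kafka client; the
consumerProps helper and the example values are hypothetical, and whether the
new assignors would rely on client.rack to learn the consumer's rack is an
assumption.

```java
import java.util.Properties;

public class LeaderRackAssignorConfig {
    // Class name taken from this proposal; not part of any released Kafka client.
    static final String PROPOSED_ASSIGNOR =
        "org.apache.kafka.clients.consumer.LeaderRackAwareCooperativeStickyAssignor";

    // Hypothetical helper: builds the properties a consumer would be created with.
    static Properties consumerProps(String bootstrapServers, String groupId, String clientRack) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Assumption: the assignor learns the consumer's rack from client.rack.
        props.setProperty("client.rack", clientRack);
        props.setProperty("partition.assignment.strategy", PROPOSED_ASSIGNOR);
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps("localhost:9092", "example-group", "rack-a");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

No broker-side setting appears in this configuration, which is the point of
the proposal: the choice of assignor stays entirely on the client.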
Proposed Changes

2.1 Core Ideas

The proposed changes aim to address the issues by:

   1. *Reading Only from Leader Brokers*:
   Clients will always fetch messages from the leader replica, bypassing
   the need for replica.selector.class on brokers. This restores Kafka's
   read-write consistency model.
   2. *Balancing Based on Leader Rack Information*:
   Balancing decisions will rely solely on the rack information of the
   leader replica. This simplifies the logic and ensures initial balance
   across racks.
   3. *Optimizing Partition Assignments*:
   When balance is achieved, partition assignors will prioritize assigning
   partitions within the same rack as the leader replica whenever possible,
   reducing cross-rack traffic.

2.2 New Partition Assignor Algorithm

The modified rack-aware partition assignor will:

   1. Collect rack metadata of the leader replicas during assignment.
   2. Distribute partitions across racks in a balanced manner while
   ensuring clients fetch from the leader replicas.
   3. Apply secondary optimization to allocate partitions within the same
   rack as the leader when rack balance is maintained.
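
The three steps above can be sketched as a small, self-contained greedy
routine. This is only one possible reading of the algorithm, not the
proposed implementation: it does not use the Kafka assignor interfaces, the
class and method names are hypothetical, and partitions and consumers are
reduced to plain strings mapped to rack names. Balance is treated as the
hard constraint (every consumer ends up with either floor(n/m) or
floor(n/m)+1 partitions), and rack locality is applied only within that
constraint.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LeaderRackAssignmentSketch {

    /**
     * partitionLeaderRack: partition -> rack of its leader replica (step 1 input).
     * consumerRack: consumer id -> rack of that consumer.
     */
    static Map<String, List<String>> assign(Map<String, String> partitionLeaderRack,
                                            Map<String, String> consumerRack) {
        Map<String, List<String>> assignment = new LinkedHashMap<>();
        for (String consumer : consumerRack.keySet()) {
            assignment.put(consumer, new ArrayList<>());
        }
        if (consumerRack.isEmpty()) {
            return assignment;
        }
        // Every consumer is guaranteed at least this many partitions.
        int base = partitionLeaderRack.size() / consumerRack.size();
        List<String> unassigned = new ArrayList<>();

        // Pass 1: place partitions on a consumer in the leader's rack,
        // but never beyond the base quota, so balance is not sacrificed.
        for (Map.Entry<String, String> p : partitionLeaderRack.entrySet()) {
            String target = leastLoaded(assignment, consumerRack, p.getValue(), true);
            if (target != null && assignment.get(target).size() < base) {
                assignment.get(target).add(p.getKey());
            } else {
                unassigned.add(p.getKey());
            }
        }
        // Pass 2: spread the remainder over the least-loaded consumers,
        // using the leader's rack only to break ties.
        for (String partition : unassigned) {
            String target = leastLoaded(assignment, consumerRack,
                                        partitionLeaderRack.get(partition), false);
            assignment.get(target).add(partition);
        }
        return assignment;
    }

    /** Least-loaded consumer; sameRackOnly restricts candidates to the leader's rack. */
    private static String leastLoaded(Map<String, List<String>> assignment,
                                      Map<String, String> consumerRack,
                                      String leaderRack, boolean sameRackOnly) {
        String best = null;
        boolean bestSameRack = false;
        for (String consumer : assignment.keySet()) {
            boolean sameRack = leaderRack != null && leaderRack.equals(consumerRack.get(consumer));
            if (sameRackOnly && !sameRack) {
                continue;
            }
            int diff = best == null ? -1
                : assignment.get(consumer).size() - assignment.get(best).size();
            if (diff < 0 || (diff == 0 && sameRack && !bestSameRack)) {
                best = consumer;
                bestSameRack = sameRack;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, String> leaders = new LinkedHashMap<>();
        leaders.put("t-0", "rack-a");
        leaders.put("t-1", "rack-a");
        leaders.put("t-2", "rack-a");
        Map<String, String> consumers = new LinkedHashMap<>();
        consumers.put("c1", "rack-a");
        consumers.put("c2", "rack-b");
        System.out.println(assign(leaders, consumers));
    }
}
```

In the example in main, all three leaders sit in rack-a while only c1 is in
that rack. The sketch still gives c2 one partition, because balance takes
priority over locality, and keeps the other two on c1.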

Compatibility, Deprecation, and Migration Plan

This change does not affect existing deployments that already use
RackAwareReplicaSelector. It adds an alternative mechanism that removes the
dependency on broker-side settings and gives clients more flexibility in
choosing an assignor.

Test Plan

   - Validate the new assignor logic across various cluster configurations
   and sizes.
   - Measure improvements in load balancing and adherence to rack-awareness
   principles.
   - Verify that read-write consistency is preserved under all conditions.
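
To make the load-balancing and rack-awareness checks above measurable, a test
could compute two simple numbers from any assignment. The helper below is a
hypothetical sketch, not part of the proposal: the class name, method names,
and the choice of metrics are assumptions. It reports the spread between the
most and least loaded consumers and the fraction of partitions consumed from
the leader's own rack.

```java
import java.util.List;
import java.util.Map;

public class AssignmentMetrics {

    /** Difference between the largest and smallest per-consumer partition count. */
    static int loadSpread(Map<String, List<String>> assignment) {
        if (assignment.isEmpty()) {
            return 0;
        }
        int min = Integer.MAX_VALUE;
        int max = 0;
        for (List<String> partitions : assignment.values()) {
            min = Math.min(min, partitions.size());
            max = Math.max(max, partitions.size());
        }
        return max - min;
    }

    /** Fraction of assigned partitions whose leader rack equals the consumer's rack. */
    static double sameRackRatio(Map<String, List<String>> assignment,
                                Map<String, String> partitionLeaderRack,
                                Map<String, String> consumerRack) {
        int total = 0;
        int local = 0;
        for (Map.Entry<String, List<String>> entry : assignment.entrySet()) {
            String rack = consumerRack.get(entry.getKey());
            for (String partition : entry.getValue()) {
                total++;
                if (rack != null && rack.equals(partitionLeaderRack.get(partition))) {
                    local++;
                }
            }
        }
        // An empty assignment has no cross-rack reads, so report it as fully local.
        return total == 0 ? 1.0 : (double) local / total;
    }
}
```

A test could then assert that loadSpread is at most 1 for every cluster
layout and compare sameRackRatio between the existing and the proposed
assignors.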

Rejected Alternatives

   - *Continuing with Broker-Dependent Configurations*:
   This was deemed counterproductive as it limits client independence and
   disrupts load balancing.
   - *Full Deprecation of Rack-Aware Assignor*:
   Rack awareness is critical for high availability and fault tolerance;
   thus, its complete removal was not considered.

Impact on Users

Users will benefit from:

   - Independent client-side customization of partition assignors without
   broker configuration changes.
   - Improved load balancing and reduced cross-rack traffic.
   - Preservation of Kafka's core read-write consistency model.
