Status
Current state: Under Discussion
Author: Jerry Cai
Release:
Discussion thread: here
JIRA: KAFKA-17755
Motivation
The current design of Kafka's rack-aware partition assignor has two significant flaws:
- Dependency on Broker-Side Configuration: The replica.selector.class setting on the broker must be configured to RackAwareReplicaSelector. This violates the principle that partition assignors should be customizable independently by the client.
- Violation of Kafka's Read-Write Consistency: The existing approach lets consumers fetch from follower replicas, which disrupts Kafka's fundamental read-write consistency model and results in load imbalance and potential downstream inefficiencies.
These issues call for an improvement that better aligns client independence with cluster balancing.
Public Interfaces
partition.assignment.strategy=org.apache.kafka.clients.consumer.LeaderRackAwareCooperativeStickyAssignor
or
partition.assignment.strategy=org.apache.kafka.clients.consumer.LeaderRackAwareRangeAssignor
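As a minimal sketch of how a client might opt in, the snippet below builds consumer properties that select one of the proposed assignors. The helper class `LeaderRackAssignorConfigSketch` and its method are hypothetical names used only for illustration. The assignor class is the one proposed by this KIP and does not exist in released Kafka, and it is an assumption that the assignor would learn the consumer's rack from the existing `client.rack` setting.

```java
import java.util.Properties;

// Hypothetical helper, for illustration only.
final class LeaderRackAssignorConfigSketch {

    // Builds consumer properties that select the proposed assignor.
    static Properties consumerProps(String bootstrapServers, String groupId, String clientRack) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Existing consumer setting that identifies the consumer's rack.
        // Assumption: the proposed assignor reads the rack from here.
        props.setProperty("client.rack", clientRack);
        // Assignor class proposed by this KIP; not part of released Kafka.
        props.setProperty("partition.assignment.strategy",
                "org.apache.kafka.clients.consumer.LeaderRackAwareCooperativeStickyAssignor");
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps("localhost:9092", "example-group", "rack-a");
        System.out.println(props.getProperty("partition.assignment.strategy"));
    }
}
```

No broker-side setting appears in this configuration, which is the point of the proposal: the choice of assignor stays entirely on the client.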
Proposed Changes
2.1 Core Ideas
The proposed changes address these issues through three ideas:
- Reading Only from Leader Brokers: Clients will always fetch messages from the leader replica, removing the need to set replica.selector.class on brokers. This restores Kafka's read-write consistency model.
- Balancing Based on Leader Rack Information: Balancing decisions will rely solely on the rack information of the leader replica. This simplifies the logic and ensures initial balance across racks.
- Optimizing Partition Assignments: Once balance is achieved, partition assignors will prefer assigning each partition to a consumer in the same rack as its leader replica whenever possible, reducing cross-rack traffic.
2.2 New Partition Assignor Algorithm
The modified rack-aware partition assignor will:
- Collect rack metadata of the leader replicas during assignment.
- Distribute partitions across racks in a balanced manner while ensuring clients fetch from the leader replicas.
- Apply secondary optimization to allocate partitions within the same rack as the leader when rack balance is maintained.
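The three steps above can be illustrated with a small, self-contained sketch. It is not the proposed implementation: the class `LeaderRackAssignmentSketch`, its method names, and the two-pass strategy are assumptions chosen for illustration, and it works on plain maps rather than Kafka's assignor API. Pass 1 places a partition on a consumer in the leader's rack only while that consumer is below the floor quota (partitions divided by consumers, rounded down). Pass 2 gives the remaining partitions to the least-loaded consumer, so balance always wins over rack locality.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical sketch, for illustration only; racks are assumed to be non-null.
final class LeaderRackAssignmentSketch {

    // leaderRackByPartition: partition id -> rack of its leader replica
    // rackByConsumer: consumer id -> rack of that consumer
    static Map<String, List<Integer>> assign(Map<Integer, String> leaderRackByPartition,
                                             Map<String, String> rackByConsumer) {
        Map<String, List<Integer>> assignment = new TreeMap<>();
        for (String consumer : rackByConsumer.keySet()) {
            assignment.put(consumer, new ArrayList<>());
        }
        if (assignment.isEmpty()) {
            return assignment;
        }
        // Every consumer can hold this many partitions without breaking balance.
        int floorQuota = leaderRackByPartition.size() / assignment.size();
        List<Integer> unassigned = new ArrayList<>();

        // Pass 1: rack-local placement, capped at the floor quota.
        for (Map.Entry<Integer, String> p : new TreeMap<>(leaderRackByPartition).entrySet()) {
            String target = pick(assignment, rackByConsumer, p.getValue(), floorQuota, true);
            if (target == null) {
                unassigned.add(p.getKey());
            } else {
                assignment.get(target).add(p.getKey());
            }
        }
        // Pass 2: leftovers go to the least-loaded consumer; ties prefer the leader's rack.
        for (Integer partition : unassigned) {
            String leaderRack = leaderRackByPartition.get(partition);
            String target = pick(assignment, rackByConsumer, leaderRack, Integer.MAX_VALUE, false);
            assignment.get(target).add(partition);
        }
        return assignment;
    }

    // Returns the least-loaded eligible consumer, or null if none is eligible.
    private static String pick(Map<String, List<Integer>> assignment,
                               Map<String, String> rackByConsumer,
                               String leaderRack, int cap, boolean sameRackOnly) {
        String best = null;
        for (String consumer : assignment.keySet()) {
            boolean sameRack = Objects.equals(rackByConsumer.get(consumer), leaderRack);
            int load = assignment.get(consumer).size();
            if (load >= cap || (sameRackOnly && !sameRack)) {
                continue;
            }
            if (best == null) {
                best = consumer;
                continue;
            }
            int bestLoad = assignment.get(best).size();
            boolean bestSameRack = Objects.equals(rackByConsumer.get(best), leaderRack);
            if (load < bestLoad || (load == bestLoad && sameRack && !bestSameRack)) {
                best = consumer;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<Integer, String> leaders = Map.of(0, "r1", 1, "r1", 2, "r2", 3, "r2");
        Map<String, String> consumers = Map.of("c1", "r1", "c2", "r2");
        System.out.println(assign(leaders, consumers)); // prints {c1=[0, 1], c2=[2, 3]}
    }
}
```

In the example, every partition lands on a consumer in its leader's rack. If all four leaders were in r1 instead, c2 would still receive two partitions: the sketch accepts cross-rack fetches rather than overloading c1.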
Compatibility, Deprecation, and Migration Plan
This change will not affect existing configurations in which RackAwareReplicaSelector is already in use. However, it provides an alternative mechanism that eliminates the dependency on broker-side settings, offering more flexibility for client-side customization.
Test Plan
- Validate the new assignor logic across various cluster configurations and sizes.
- Measure improvements in load balancing and adherence to rack-awareness principles.
- Verify that read-write consistency is preserved under all conditions.
Rejected Alternatives
- Continuing with Broker-Dependent Configurations: This was deemed counterproductive as it limits client independence and disrupts load balancing.
- Full Deprecation of Rack-Aware Assignor: Rack awareness is critical for high availability and fault tolerance; thus, its complete removal was not considered.
Impact on Users
Users will benefit from:
- Independent client-side customization of partition assignors without broker configuration changes.
- Improved load balancing and reduced cross-rack traffic.
- Preservation of Kafka's core read-write consistency model.