Emanuele Sabellico created KAFKA-17237:
------------------------------------------
Summary: [rack-aware assignors] Rebalance is triggered every time a broker isn't reported from a metadata call
Key: KAFKA-17237
URL: https://issues.apache.org/jira/browse/KAFKA-17237
Project: Kafka
Issue Type: Bug
Components: clients
Affects Versions: 3.8.0, 3.5.0
Reporter: Emanuele Sabellico
Attachments: test.log

When a client is configured for rack-awareness to enable fetch from follower (FFF) and rack-aware assignors, a rebalance is triggered every time a broker disappears from a Metadata response, such as during a cluster roll.

This happens because, after KIP-881, the metadata appears to have changed: the set of racks for a partition is different, since brokers that are down carry no rack information.

*How to reproduce*
* Enable *client.rack* on the client and *broker.rack* on the cluster
* Create a topic with replicas on all the nodes
* Subscribe to that topic on the client
* Stop one of the brokers
* Observe that a rebalance is triggered

Attached is a log reproducing the issue in the Java client. A few lines showing the rejoin requests:

{noformat}
[2024-08-01 15:09:07,472] INFO [Consumer clientId=consumer-test_racks-1, groupId=test_racks] Request joining group due to: cached metadata has changed from (version4: {test_new=[racks=[null, 1b, 1c]]}) at the beginning of the rebalance to (version5: {test_new=[racks=[1a, 1b, 1c]]}) (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2024-08-01 15:10:38,689] INFO [Consumer clientId=consumer-test_racks-1, groupId=test_racks] Request joining group due to: cached metadata has changed from (version6: {test_new=[racks=[1a, 1b, 1c]]}) at the beginning of the rebalance to (version42: {test_new=[racks=[null, 1a, 1c]]}) (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2024-08-01 15:11:04,106] INFO [Consumer clientId=consumer-test_racks-1, groupId=test_racks] Request joining group due to: cached metadata has changed from (version43: {test_new=[racks=[1a, 1b, 1c]]}) at the beginning of the rebalance to (version45: {test_new=[racks=[null, 1a, 1b]]}) (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
{noformat}

The same happens in librdkafka, as reported in this issue: [https://github.com/confluentinc/librdkafka/issues/4742]
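
The cause described in this report can be illustrated with a small standalone sketch. It is not the Kafka client code: the class `RackSnapshotSketch`, the helper `racksOf`, and the broker ids are hypothetical, chosen only to mirror the rack names in the log lines. It assumes the per-partition rack set is derived from the brokers present in the latest Metadata response, so a replica on a missing broker contributes a null rack and the set no longer compares equal to the previous one.

{noformat}
```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RackSnapshotSketch {

    // Hypothetical helper: collect the racks of a partition's replicas using
    // only the brokers listed in the latest metadata response. A replica whose
    // broker is missing from the response contributes null.
    static Set<String> racksOf(List<Integer> replicaIds, Map<Integer, String> liveBrokerRacks) {
        Set<String> racks = new HashSet<>();
        for (Integer id : replicaIds) {
            racks.add(liveBrokerRacks.get(id)); // null when the broker is absent
        }
        return racks;
    }

    public static void main(String[] args) {
        // One partition replicated on three brokers (ids are made up).
        List<Integer> replicas = Arrays.asList(1, 2, 3);

        Map<Integer, String> allUp = new HashMap<>();
        allUp.put(1, "1a");
        allUp.put(2, "1b");
        allUp.put(3, "1c");

        // Broker 1 is stopped, so it is absent from the next response.
        Map<Integer, String> brokerOneDown = new HashMap<>(allUp);
        brokerOneDown.remove(1);

        Set<String> before = racksOf(replicas, allUp);
        Set<String> during = racksOf(replicas, brokerOneDown);

        // The replica assignment is unchanged, yet the rack sets differ,
        // which an equality check on the snapshot reads as "metadata changed".
        System.out.println("snapshot changed: " + !before.equals(during));
    }
}
```
{noformat}

Under these assumptions, stopping and restarting any single broker flips the comparison twice (once when the rack becomes null, once when it returns), which matches the pairs of rejoin requests seen during a cluster roll.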