Hi Team,

We have a 6-node Kafka cluster (version 2.0.0). When I try to get the state of a consumer group by specifying only one broker IP, I get different results depending on which broker I ask (4 of the brokers respond with one result and 2 of the brokers with another).

bin/kafka-consumer-groups.sh --bootstrap-server 10.32.218.112:9092 --describe --state --group consumer-group
COORDINATOR (ID)          ASSIGNMENT-STRATEGY   STATE    #MEMBERS
10.32.218.112:9092 (1)    range                 Stable   1

bin/kafka-consumer-groups.sh --bootstrap-server 10.32.67.102:9092 --describe --state --group consumer-group
COORDINATOR (ID)          ASSIGNMENT-STRATEGY   STATE    #MEMBERS
10.32.218.112:9092 (1)    range                 Stable   1

bin/kafka-consumer-groups.sh --bootstrap-server 10.33.150.9:9092 --describe --state --group consumer-group
Consumer group 'consumer-group' has no active members.
COORDINATOR (ID)          ASSIGNMENT-STRATEGY   STATE    #MEMBERS
10.35.168.252:9092 (4)                          Empty    0

bin/kafka-consumer-groups.sh --bootstrap-server 10.35.168.252:9092 --describe --state --group consumer-group
Consumer group 'consumer-group' has no active members.
COORDINATOR (ID)          ASSIGNMENT-STRATEGY   STATE    #MEMBERS
10.35.168.252:9092 (4)                          Empty    0

bin/kafka-consumer-groups.sh --bootstrap-server 10.33.21.48:9092 --describe --state --group consumer-group
Consumer group 'consumer-group' has no active members.
COORDINATOR (ID)          ASSIGNMENT-STRATEGY   STATE    #MEMBERS
10.35.168.252:9092 (4)                          Empty    0

I can see the same behaviour with other consumer groups as well. There are a few consumer groups which are active in both "mini clusters" (I am not sure what the appropriate name should be in this case).

The validations I have done:

1. All the brokers are active and are able to talk to each other.
2. Every broker lists all the other brokers when we run
   bin/kafka-broker-api-versions.sh --bootstrap-server localhost:9092 | awk '/^[a-z]/ {print $1}'
3. I checked the controller IP from ZooKeeper and validated that there are no anomalies in the controller logs on any of the boxes.
4. I am able to reproduce the same issue right now for a new consumer group with these steps:
   a. Start a Kafka consumer with group cg1 using broker IP 10.32.218.112:9092.
   b. Validate that the status from broker IP 10.32.218.112:9092 shows the consumer as live.
   c. Validate that the status from broker IP 10.35.168.252:9092 shows the consumer as not live.
   d. Start a Kafka consumer with group cg1 using broker IP 10.35.168.252:9092.
   e. Validate that the status from broker IP 10.32.218.112:9092 shows the consumer as live.
   f. Validate that the status from broker IP 10.35.168.252:9092 shows the consumer as live.
   However, the consumer IDs reported by the two brokers are different. Also, when we stop both consumers, the last committed offsets reported by the two brokers are different, which confirms that the two consumers are treated separately.
5. The only suspicious log that I found on one of the brokers is:
   WARN [2021-07-29 12:51:52,811] [kafka-request-handler-15][] state.change.logger - [Broker id=5] Ignoring LeaderAndIsr request from controller 1 with correlation id 2 epoch 5 for partition __consumer_offsets-15 since its associated leader epoch 101 is not higher than the current leader epoch 101
   There are quite a few of these logs for different partitions, and also similar failure logs in the controller logs on the controller.

I have tried searching Stack Overflow and the Kafka JIRA but was not able to find a relevant issue, hence I am reaching out to you. Can you please help with this?

Regards,
Maneesh Bhunwal
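
P.S. In case it helps anyone repeat the per-broker comparison, below is a minimal sketch of how the check above could be scripted. It is only an illustration: the helper names (parse_state, group_views, describe) are made up for this mail, the parsing assumes the header line is followed by a single data row as in the output above, and the script path assumes it is run from the Kafka installation directory. Only the five broker addresses that appear in the commands above are listed.

```python
import subprocess
from collections import defaultdict

# Broker addresses copied from the commands in this mail (hypothetical list name).
BROKERS = [
    "10.32.218.112:9092",
    "10.32.67.102:9092",
    "10.33.150.9:9092",
    "10.35.168.252:9092",
    "10.33.21.48:9092",
]


def parse_state(output):
    """Extract (coordinator, broker_id, state, members) from --describe --state output.

    Returns None if no COORDINATOR header followed by a data row is found.
    """
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    for i, ln in enumerate(lines):
        if ln.startswith("COORDINATOR") and i + 1 < len(lines):
            fields = lines[i + 1].split()
            # Row layout: host:port, "(id)", optional strategy, state, member count.
            # The strategy column is blank for an Empty group, so index from the end.
            return fields[0], fields[1].strip("()"), fields[-2], int(fields[-1])
    return None


def group_views(outputs):
    """Map each distinct parsed view to the brokers that reported it."""
    views = defaultdict(list)
    for broker, out in outputs.items():
        views[parse_state(out)].append(broker)
    return dict(views)


def describe(broker, group, script="bin/kafka-consumer-groups.sh"):
    """Run the same CLI command as above against one broker and return its stdout."""
    result = subprocess.run(
        [script, "--bootstrap-server", broker, "--describe", "--state", "--group", group],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    outputs = {b: describe(b, "consumer-group") for b in BROKERS}
    for view, brokers in group_views(outputs).items():
        print(view, "<-", ", ".join(brokers))
```

If every broker agreed on the coordinator, this would print a single line; more than one line means the brokers disagree about the group's coordinator or state, which is what I am seeing.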