Hi Team,

We have a 6-node Kafka cluster (version 2.0.0). When I try to get the state
of a consumer group while specifying only one broker IP, I get different
results depending on which broker I ask: 4 of the brokers respond with one
result and 2 of the brokers respond with another.

bin/kafka-consumer-groups.sh --bootstrap-server 10.32.218.112:9092 --describe --state --group consumer-group
COORDINATOR (ID)          ASSIGNMENT-STRATEGY       STATE                #MEMBERS
10.32.218.112:9092 (1)    range                     Stable               1


bin/kafka-consumer-groups.sh --bootstrap-server 10.32.67.102:9092 --describe --state --group consumer-group
COORDINATOR (ID)          ASSIGNMENT-STRATEGY       STATE                #MEMBERS
10.32.218.112:9092 (1)    range                     Stable               1


bin/kafka-consumer-groups.sh --bootstrap-server 10.33.150.9:9092 --describe --state --group consumer-group
Consumer group 'consumer-group' has no active members.
COORDINATOR (ID)          ASSIGNMENT-STRATEGY       STATE                #MEMBERS
10.35.168.252:9092 (4)                              Empty                0


bin/kafka-consumer-groups.sh --bootstrap-server 10.35.168.252:9092 --describe --state --group consumer-group
Consumer group 'consumer-group' has no active members.
COORDINATOR (ID)          ASSIGNMENT-STRATEGY       STATE                #MEMBERS
10.35.168.252:9092 (4)                              Empty                0

bin/kafka-consumer-groups.sh --bootstrap-server 10.33.21.48:9092 --describe --state --group consumer-group
Consumer group 'consumer-group' has no active members.
COORDINATOR (ID)          ASSIGNMENT-STRATEGY       STATE                #MEMBERS
10.35.168.252:9092 (4)                              Empty                0
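
The per-broker checks above can be repeated in one pass with a small wrapper. This is only a minimal sketch, assuming it is run from the Kafka installation directory and that the output keeps the four-column layout shown above; the function names summarize_state and compare_brokers are hypothetical, not part of Kafka.

```shell
#!/usr/bin/env bash

# summarize_state (hypothetical helper): reads the output of
# `kafka-consumer-groups.sh --describe --state` on stdin and prints
# "<coordinator> (<id>) <state> <members>" for the first data row.
# The assignment strategy column may be empty, so state and member
# count are taken from the end of the row.
summarize_state() {
  awk '
    /^[[:space:]]*COORDINATOR/ { seen = 1; next }
    seen && NF >= 4 { print $1, $2, $(NF-1), $NF; exit }
  '
}

# compare_brokers (hypothetical helper): asks each broker in turn for
# the state of one group and prints one summary line per broker.
# Usage: compare_brokers <group> <host:port> [<host:port> ...]
compare_brokers() {
  local group="$1"
  shift
  local broker result
  for broker in "$@"; do
    result="$(bin/kafka-consumer-groups.sh --bootstrap-server "$broker" \
      --describe --state --group "$group" 2>/dev/null | summarize_state)"
    printf '%s -> %s\n' "$broker" "${result:-no state row found}"
  done
}
```

For example, `compare_brokers consumer-group 10.32.218.112:9092 10.35.168.252:9092` prints one line per bootstrap broker, which makes it easy to see which brokers disagree about the coordinator.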


I see the same behaviour with other consumer groups as well. A few consumer
groups are active in both "mini clusters" (I am not sure what the appropriate
name is in this case).

The validations I have done so far:

1. All the brokers are active and are able to talk to each other.

2. Every broker lists all the other brokers when we run
bin/kafka-broker-api-versions.sh --bootstrap-server localhost:9092 | awk '/^[a-z]/ {print $1}'

3. I checked the controller IP in ZooKeeper and validated that there are no
anomalies in the controller logs on any of the boxes.

4. I am able to reproduce the issue right now by following these steps for a
new consumer group:

     a. Start a Kafka consumer with group cg1 using broker IP 10.32.218.112:9092.
     b. Check the status using broker IP 10.32.218.112:9092: the consumer is
shown as live.
     c. Check the status using broker IP 10.35.168.252:9092: the consumer is
shown as not live.
     d. Start a second Kafka consumer with group cg1 using broker IP 10.35.168.252:9092.
     e. Check the status using broker IP 10.32.218.112:9092: the consumer is
shown as live.
     f. Check the status using broker IP 10.35.168.252:9092: the consumer is
shown as live.

   However, the consumer IDs reported by the two brokers are different. Also,
when we stop both consumers, the last committed offsets reported by the two
brokers are different. This confirms that the two consumers are being treated
separately.

5. The only suspicious log that I found on one of the brokers is:

     WARN [2021-07-29 12:51:52,811] [kafka-request-handler-15][] state.change.logger - [Broker id=5] Ignoring LeaderAndIsr request from controller 1 with correlation id 2 epoch 5 for partition __consumer_offsets-15 since its associated leader epoch 101 is not higher than the current leader epoch 101

     There are quite a few of these log entries for different partitions, and
there are similar failure entries in the controller logs on the controller broker.
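
For reference, the controller check in step 3 can be sketched as follows. This is a rough sketch rather than the exact commands used: it assumes ZooKeeper is reachable at localhost:2181 (an assumption, adjust as needed), that it is run from the Kafka installation directory, and that the /controller znode holds JSON containing a "brokerid" field. The function names controller_id and show_controller are hypothetical.

```shell
#!/usr/bin/env bash

# controller_id (hypothetical helper): reads the output of
# `zookeeper-shell.sh <zk> get /controller` on stdin and prints the
# numeric broker id found in the "brokerid" field of the znode JSON.
controller_id() {
  grep -o '"brokerid":[0-9]*' | head -n 1 | cut -d: -f2
}

# show_controller (hypothetical helper): prints the id of the broker
# that ZooKeeper currently records as the active controller.
# The default address localhost:2181 is an assumption.
show_controller() {
  local zk="${1:-localhost:2181}"
  bin/zookeeper-shell.sh "$zk" get /controller 2>/dev/null | controller_id
}
```

Running `show_controller` against each ZooKeeper node and comparing the ids is one way to confirm that every box agrees on which broker is the controller.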


I have tried searching on Stack Overflow and the Kafka JIRA but was not able to
find a relevant issue, hence I am reaching out to you. Can you please help with this?

Regards
Maneesh Bhunwal
