Hi,

We are facing the following issues with our Kafka cluster.
- Kafka version: 2.0.0

We have the following cluster configuration:
- Number of brokers: 14
- Per broker: 37 GB memory and 14 cores
- Topics: 40 - 50
- Partitions per topic: 32
- Replicas: 3
- Min in-sync replicas: 2
- __consumer_offsets partitions: 50
- offsets.topic.replication.factor=3
- default.replication.factor=3
- Consumers: ~4000 (will grow to ~7K)
- Consumer groups: ~4000 (will grow to ~7K)

Important: each consumer consumes from one topic, and each consumer group has only one consumer, due to some architectural constraints.

We are facing two major problems with consumer groups:

- The first time we start a consumer with a new group name, it works very well. But a subsequent restart (with the previous / older group name) causes problems for some consumers. We get the following messages:

INFO [2019-08-28 19:05:34,481] [main] [AbstractCoordinator]: [Consumer clientId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2, groupId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2] Discovered group coordinator 10.XX.XXX.112:9092 (id: 2147483631 rack: null)
INFO [2019-08-28 19:05:34,481] [main] [AbstractCoordinator]: [Consumer clientId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2, groupId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2] Group coordinator 10.XX.XXX.112:9092 (id: 2147483631 rack: null) is unavailable or invalid, will attempt rediscovery
INFO [2019-08-28 19:05:34,582] [main] [AbstractCoordinator]: [Consumer clientId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2, groupId=djXXX#XXX-XXX-XXX-XX-5-1478729-XX-XXXXX-XX-ingestion-v2] Discovered group coordinator 10.32.197.112:9092 (id: 2147483631 rack: null)

These messages keep coming and the consumer is not able to start / poll. But if we change the group name, it works the first time without any issue (and fails on a subsequent restart). So it also means that there is no issue with the broker.
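In case it helps with diagnosis, here is a minimal sketch of how I understand the coordinator lookup to work. It rests on two assumptions taken from my reading of the Kafka source, which I have not verified against our cluster: a group is assigned to a partition of __consumer_offsets by the non-negative hash of its group ID modulo the partition count, and the coordinator id printed by the client is Integer.MAX_VALUE minus the broker id. The class and method names are hypothetical and not part of any Kafka API.

```java
// Hypothetical helper for illustration only; not part of the Kafka client API.
final class CoordinatorSketch {

    // Assumed mapping: (hash of group ID, forced non-negative) modulo the
    // number of __consumer_offsets partitions (50 in our cluster).
    static int offsetsPartitionFor(String groupId, int numOffsetsPartitions) {
        return (groupId.hashCode() & 0x7fffffff) % numOffsetsPartitions;
    }

    // Assumed encoding: the client logs Integer.MAX_VALUE - brokerId as the
    // coordinator id, so subtracting again recovers the broker id.
    static int brokerIdFromCoordinatorId(int coordinatorId) {
        return Integer.MAX_VALUE - coordinatorId;
    }

    public static void main(String[] args) {
        // "example-group" is a placeholder; pass a real group ID as an argument.
        String groupId = args.length > 0 ? args[0] : "example-group";
        System.out.println("offsets partition: " + offsetsPartitionFor(groupId, 50));
        System.out.println("broker id: " + brokerIdFromCoordinatorId(2147483631));
    }
}
```

Under these assumptions, the coordinator id 2147483631 from the log above would correspond to broker id 16, and the leader of the computed __consumer_offsets partition would be the broker acting as coordinator for that group.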
Could this be because of having a single consumer in each consumer group? If yes, what would be the workaround here?

- The second error occurs when the consumer is up and running. After a couple of hours, it starts failing and throws the following error:

[Consumer clientId=banneXXXX#XX-XXX-XXX-XXX-X-1388688-XXX-XXXXX, groupId=bannerXXX#XX-XXX-XXX-XXX-X-1388688-XXX-XXXXX] Offset commit failed on partition banneXXXX-7 at offset 13711176: This is not the correct coordinator
[Consumer clientId=banXXerGrXXMXX#XX-XX-XXXXX-XXX-5-1478733-XXX-XXXXX-ingestion-v2, groupId=banXXerGrXXMXX#XX-XX-XXXXX-XXX-5-1478733-XXX-XXXXX-ingestion-v2] Offset commit failed on partition banXXerGrXXMXX-8 at offset 14741: This is not the correct coordinator.

I wanted to know the following things:
- What is the maximum number of consumer groups in a Kafka cluster? I didn't find any documented limit on the internet; everywhere it is mentioned that it is limited by the OS.
- Is there a problem if a consumer group has only one consumer?
- Is there some problem with my Kafka configuration?

Regards,
Hrishikesh