Ara Zarifian created KAFKA-12150:
------------------------------------

             Summary: Consumer group refresh not working with clustered MM2 setup
                 Key: KAFKA-12150
                 URL: https://issues.apache.org/jira/browse/KAFKA-12150
             Project: Kafka
          Issue Type: Bug
          Components: mirrormaker
    Affects Versions: 2.7.0
            Reporter: Ara Zarifian

I'm running MM2 with Kafka 2.7 with the following configuration:

{code}
clusters = eastus2, westus
eastus2.bootstrap.servers = clusrter1.example.com:9092
westus.bootstrap.servers = cluster2.example.com:9092

eastus2->westus.enabled = true
eastus2->westus.topics = .*
westus->eastus2.enabled = true
westus->eastus2.topics = .*

refresh.topics.enabled = true
refresh.topics.interval.seconds = 5
refresh.groups.enabled = true
refresh.groups.interval.seconds = 5
sync.topic.configs.enabled = true
sync.topic.configs.interval.seconds = 5
sync.topic.acls.enabled = false
sync.topic.acls.interval.seconds = 5
sync.group.offsets.enabled = true
sync.group.offsets.interval.seconds = 5
emit.checkpoints.enabled = true
emit.checkpoints.interval.seconds = 5
emit.heartbeats.enabled = true
emit.heartbeats.interval.seconds = 5

replication.factor = 3
checkpoints.topic.replication.factor = 3
heartbeats.topic.replication.factor = 3
offset-syncs.topic.replication.factor = 3
offset.storage.replication.factor = 3
status.storage.replication.factor = 3
config.storage.replication.factor = 3
{code}

More specifically, I'm running multiple instances of MM2 with the above configuration in Kubernetes pods. I was testing the new automatic consumer group offset translation functionality and noticed what appears to be a problem when running more than one instance of MM2 in this fashion.

Based on [the KIP|https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0], I should be able to run multiple instances in this manner (see "Running a dedicated MirrorMaker cluster"). However, when I enabled replication using a 3-instance MM2 cluster, consumer groups were not synchronized across clusters at all.

When I run my test case with a single MM2 instance, consumer group synchronization consistently works as expected. When I run my 3-node test case, synchronization begins only once I scale the number of replicas down to 1.

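As a sanity check on the configuration above, the replication flows it enables can be listed with a small script. This is only a minimal sketch in Python: `parse_props`, `enabled_flows`, and the `SAMPLE` excerpt are hypothetical names introduced for illustration and are not part of Kafka or MM2. It reads the properties text only and does not contact either cluster.

```python
def parse_props(text):
    """Parse 'key = value' lines into a dict, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def enabled_flows(props):
    """Return sorted (source, target) pairs whose 'source->target.enabled' is true."""
    flows = []
    for key, value in props.items():
        if key.endswith(".enabled") and "->" in key and value.lower() == "true":
            source, target = key[: -len(".enabled")].split("->", 1)
            flows.append((source.strip(), target.strip()))
    return sorted(flows)


# Hypothetical excerpt of the configuration quoted in this report.
SAMPLE = """\
clusters = eastus2, westus
eastus2->westus.enabled = true
westus->eastus2.enabled = true
sync.group.offsets.enabled = true
"""

if __name__ == "__main__":
    props = parse_props(SAMPLE)
    print(enabled_flows(props))
    print(props.get("sync.group.offsets.enabled"))
```

With the excerpt above, both directions (eastus2 to westus and westus to eastus2) are reported as enabled, which matches the bidirectional setup I intended.
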
Am I misinterpreting the manner in which the KIP describes MM2 clusters, or is this interaction an unexpected one?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)