[ https://issues.apache.org/jira/browse/KAFKA-10857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17498863#comment-17498863 ]
Guram Savinov edited comment on KAFKA-10857 at 3/2/22, 8:23 PM:
----------------------------------------------------------------
The problem still exists in Kafka 3.x.
Starting instances one by one doesn't work.
Take a look at KAFKA-10586 and KAFKA-9981.
was (Author: gsavinov):
The problem still exists in Kafka 3.x
Starting pods one by one doesn't work.
Take a look at KAFKA-10586 and KAFKA-9981
> Mirror Maker 2 - replication not working when deploying multiple instances
> --------------------------------------------------------------------------
>
> Key: KAFKA-10857
> URL: https://issues.apache.org/jira/browse/KAFKA-10857
> Project: Kafka
> Issue Type: Bug
> Components: KafkaConnect, mirrormaker
> Affects Versions: 2.6.0, 2.5.1
> Reporter: Athanasios Fanos
> Priority: Major
>
> We believe we are experiencing a bug when deploying Mirror Maker 2 in
> distributed mode in our environments. Replication does not work consistently
> after initial deployment and does not start working even after some time
> (24h+).
> *Environment & replication set-up*
> * 2 regions, each with a separate Kafka cluster (let's call them Region A
> and Region B)
> * 3 instances of Mirror Maker are deployed at the same time in Region B with
> the same configuration
> * Replication is set up to be bi-directional (regionA->regionB &
> regionB->regionA)
> *Container Version*
> Observed with both {{confluentinc/cp-kafka:5.5.1}} &
> {{confluentinc/cp-kafka:6.0.1}}
> *Mirror Maker 2 configuration*
> {code:java}
> clusters=regionA,regionB
> regionA.bootstrap.servers=regionA-kafka:9092
> regionB.bootstrap.servers=regionB-kafka:9092
> regionA->regionB.enabled=true
> regionA->regionB.topics=testTopic
> regionB->regionA.enabled=true
> regionB->regionA.topics=testTopic
> sync.topic.acls.enabled=false
> tasks.max=9
> {code}
> *Observed behavior*
> * After deploying the 3 Mirror Maker instances (at the same time),
> replication for 1 or both mirrors does not work
> ** If we scale down to a single instance of Mirror Maker and wait for about
> 5 minutes (refresh.topics.interval.seconds?), replication starts working.
> After this, scaling up to 3 instances correctly distributes the load between
> the deployed instances
> *Expected behavior*
> * Replication should work for all configured mirrors when running in
> distributed mode
> * When starting multiple instances of Mirror Maker at the same time,
> replication should work; a 1 by 1 rollout should not be required
> *Additional details*
> * When replication is not working, we observe that in Mirror Maker's internal
> config topics the partitions are not assigned to the tasks, e.g.
> {{task.assigned.partitions}} is not set at all under the properties object.
> *Workaround*
> * As a workaround, we start Mirror Maker instances 1 by 1 with some delay
> between each instance. This allows the first instance to set up the
> configuration in the internal topics correctly. Doing this seems to ensure
> that replication works as expected.
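
The staggered start described under Workaround could be scripted roughly as follows. This is only a minimal sketch, not something taken from the issue: the function name staggered_start, its parameters, and the idea of launching every instance from a single script are assumptions made for illustration. Note that the comment at the top of this message reports that starting instances one by one does not help on Kafka 3.x.

```python
import subprocess
import time
from typing import Callable, List, Sequence


def staggered_start(
    commands: Sequence[Sequence[str]],
    delay_seconds: float,
    launch: Callable[[Sequence[str]], object] = subprocess.Popen,
    sleep: Callable[[float], None] = time.sleep,
) -> List[object]:
    """Launch each command in order, pausing between consecutive launches.

    `commands` holds one argument list per Mirror Maker instance
    (hypothetical; use whatever starts an instance in your environment).
    `launch` and `sleep` are injectable so the ordering can be exercised
    without starting real processes.
    """
    handles = []
    for index, command in enumerate(commands):
        if index > 0:
            # Wait before adding another instance, so the previously started
            # one has time to write its configuration to the internal topics.
            sleep(delay_seconds)
        handles.append(launch(command))
    return handles
```

For example, staggered_start([cmd] * 3, delay_seconds=300) would start three instances five minutes apart, where cmd is the (environment-specific) command that starts one instance. The 300 second delay is an assumption; the issue only says "some delay".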