Hello,

I'm working with MirrorMaker 2 and have been having some trouble with both 
the configuration and the end result. As I understand it, updated 
documentation is coming in the next Kafka release?

I've been tasked with evaluating MM2 as a solution for an active/standby 
two-cluster setup. We would like the active cluster to replicate continuously 
to a disaster recovery cluster so that we can fail over quickly if something 
horrible happens to the active cluster. I'm aware there are other, 
potentially easier, ways to solve this problem, but this is what I've been 
asked to try.

The first trouble I ran into was the difference in configuration between 
using the Connect REST API (JSON) and running in dedicated standalone mode. 
Perhaps this is better documented elsewhere, but it was quite confusing for 
me. I ultimately ended up with something resembling this:

{
  "name": "efw-mirrormaker-source-nonprod-target-dr-1",
  "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
  "tasks.max": 1,
  "topics": "mm2\\.efw.*",
  "errors.log.enable": true,
  "errors.log.include.messages": true,
  "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
  "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
  "clusters": "mm2.source.nonprod, dr",
  "source.cluster.alias": "mm2.source.nonprod",
  "target.cluster.alias": "dr",
  "source.cluster.bootstrap.servers": "kafka-bootstrap:9092",
  "target.cluster.bootstrap.servers": "kafka-boostrap-dr:9092",
  "source.admin.client.id": "efwmm2sourceadmin",
  "source.cluster.security.protocol": "SSL",
  "target.admin.client.id": "efwmm2targetadmin"
}
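
In case it matters, I'm pushing that JSON to the Connect REST API along these 
lines (the Connect host, port, and file name are placeholders for our 
environment):

curl -X PUT -H "Content-Type: application/json" \
  --data @efw-mirrormaker-source-nonprod-target-dr-1.json \
  http://connect-host:8083/connectors/efw-mirrormaker-source-nonprod-target-dr-1/config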

That connector ran successfully, but its configuration has some key 
differences from the following properties file, which I used for dedicated 
standalone mode:

name = efw-mirrormaker-source-nonprod-target-dr-2
tasks.max = 1
emit.heartbeats.enabled = false
sync.topic.configs.enabled = true
sync.topic.acls.enabled = true
errors.log.enable = true
errors.log.include.messages = true
key.converter = org.apache.kafka.connect.converters.ByteArrayConverter
value.converter = org.apache.kafka.connect.converters.ByteArrayConverter
clusters = mm2.source.nonprod, dr
mm2.source.nonprod.bootstrap.servers = kafka-bootstrap:9092
dr.bootstrap.servers = kafka-dr-bootstrap:9092
mm2.source.nonprod.client.id = efw-mm2-source-nonprod
dr.client.id = efw-mm2-dr
mm2.source.nonprod.security.protocol = SSL
mm2.source.nonprod.cluster.security.protocol = SSL
mm2.source.nonprod->dr.enabled = true
mm2.source.nonprod->dr.emit.heartbeats.enabled = false
mm2.source.nonprod->dr.emit.checkpoints.enabled = true
mm2.source.nonprod->dr.sync.topic.configs.enabled = true
mm2.source.nonprod->dr.sync.topic.acls.enabled = true
mm2.source.nonprod->dr.topics = mm2\\.efw.*
mm2.source.nonprod.group.id = efw-mm2-group1
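
For completeness, I'm launching the dedicated driver with that file along 
these lines (Kafka install path abbreviated, file name is a placeholder):

./bin/connect-mirror-maker.sh mm2.properties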

That leads me to the bulk of my questions:

First, are the differences in configuration documented well anywhere? For 
instance, the "A->B" flow properties don't seem to work at all through the 
Connect REST API. Additionally, configuring the admin client in JSON requires 
"source.cluster.{config property}", whereas in the properties file it's 
"{your source alias}.{config property}".

Second, I seem unable to turn off heartbeats when running in dedicated 
standalone mode. I've tried every combination of configuration settings I can 
imagine and was never able to get them to stop. I mainly wanted to turn them 
off because they were flooding my logs due to ACL issues on the "heartbeats" 
topic. I've solved the ACL part, but I still can't disable heartbeats. Am I 
just misconfiguring something? Has anybody else had this issue?
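
To be concrete, these are the forms I've been experimenting with in the 
properties file (the first two are already in the file above; the last is a 
guess at disabling heartbeats on the reverse flow, in case that's what keeps 
emitting them):

emit.heartbeats.enabled = false
mm2.source.nonprod->dr.emit.heartbeats.enabled = false
dr->mm2.source.nonprod.emit.heartbeats.enabled = false

Is one of these supposed to take precedence over the others, or is the 
heartbeat connector controlled somewhere else entirely?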

Third, and last, I've managed to get MirrorMaker 2 running and mirroring data 
in dedicated standalone mode, but I'm receiving three copies of each message 
on my mirrored topics (the same issue described in this SO post: 
https://stackoverflow.com/questions/60005462/kafka-mirrormaker-2-0-duplicate-each-messages).
For instance, if I produce the message "Foo" to my active cluster, I receive 
three separate "Foo" messages on the same topic in the DR cluster. This seems 
independent of partition count, replication factor, how many brokers are 
running, and whether I'm running locally (in-cluster) or remotely, and I'm 
quite confused as to why. Does anybody have any guidance here?
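
For what it's worth, this is how I'm observing the duplicates on the DR 
cluster (the topic name is a placeholder; as far as I understand, the default 
replication policy prefixes mirrored topics with the source alias):

./bin/kafka-console-consumer.sh --bootstrap-server kafka-dr-bootstrap:9092 \
  --topic mm2.source.nonprod.mm2.efw.example --from-beginning

That consumer prints three copies of every message produced on the source 
side.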

Thanks for taking a moment to read this. I'm hoping I'm not alone in these 
issues and that this can help shed some light for others.
