Kafka Connect replication using MirrorMaker 2.0
Hi everyone,

I'm trying to set up an active-active deployment of Kafka between two data centers using MirrorMaker 2, but I'm facing a problem. Each site runs its own Kafka Connect cluster, and each of them connects to a different database through sink and source connectors (MongoDB source connector, JDBC sink/source connector).

I'd like to know what the best practice is for running Kafka Connect in an active-active setup, since I noticed that the "connect-offsets" topic is not replicated by MM2.

Best regards,
Daniel
Re: Mirrormaker 2.0 - duplicates with idempotence enabled
Thank you, Ning, for your response.

Could you please point me to a statement or a specific part of the code where the "at-least-once" delivery guarantee is mentioned? Just for the record.

Kind regards,

From: Ning Zhang
Sent: Wednesday, 17 March 2021 22:39
To: users@kafka.apache.org
Subject: Re: Mirrormaker 2.0 - duplicates with idempotence enabled

Hello Vangelis,

By default, MM 2.0 currently provides an at-least-once delivery guarantee, meaning there will be duplicate messages under some failure scenarios.

If you need no message loss and no duplicates, there is a pending PR for MM 2.0:

https://issues.apache.org/jira/browse/KAFKA-10339

On 2021/03/10 07:45:17, Vangelis Typaldos wrote:
> Hi,
>
> I have set up MirrorMaker 2 (Kafka v2.6.0) on 2 clusters (CL1, CL2), and the
> mirroring seems to work properly except for an issue with duplicates in the
> following scenario:
>
> While both clusters are up and running, I simulate an incident by stopping
> the brokers of the CL2 cluster one by one. Stopping the first two brokers
> does not cause any issue: all messages of my test topic are mirrored without
> problems to CL1.topic on the CL2 cluster. After I stop the last broker,
> mirroring to the CL2 side obviously stops, since all of its brokers are down.
> A producer is always active during the test, feeding messages to the topic
> on CL1.
>
> The problem starts when I restart the brokers. After starting the first
> broker, I notice that some messages (about 5%) are duplicated. I have
> connected a client to CL1.topic and I can confirm that there are indeed
> duplicated messages in my mirrored topic.
>
> Kindly suggest how I could avoid these duplicates. Could it be that
> idempotence does not work correctly during a broker shutdown?
>
> Below is the relevant part of my MM2 config:
>
> clusters = CL1, CL2
> CL1.bootstrap.servers = broker1CL1:9092, broker2CL1:9092, broker3CL1:9092
> CL2.bootstrap.servers = broker1CL2:9092, broker2CL2:9092, broker3CL2:9092
>
> PRIM->DSTR.enabled = true
> DSTR->PRIM.enabled = true
>
> CL1.producer.enable.idempotence = true
> CL1.producer.acks=all
> CL1.producer.max.in.flight.requests.per.connection=5
> CL1.producer.retries=2147483647
> CL1.consumer.isolation.level=read_committed
> CL2.producer.enable.idempotence = true
> CL2.producer.acks=all
> CL2.producer.max.in.flight.requests.per.connection=5
> CL2.producer.retries=2147483647
> CL2.consumer.isolation.level=read_committed
>
> Best Regards,
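One possible reading of the question about idempotence, sketched as a toy model rather than as Kafka code: an idempotent producer lets the broker drop a retried batch by tracking a producer id and a sequence number, but that only covers retries within one producer session. If the mirroring task restarts and re-reads a record whose source offset was never committed, the record arrives through a new session and is accepted again. The ToyBroker class and its fields below are hypothetical names used only for illustration, and the restart is an assumption about what happened in the test above.

```python
# Toy model of broker-side idempotent-producer deduplication.
# Illustration only: ToyBroker and its fields are hypothetical, not Kafka APIs.

class ToyBroker:
    def __init__(self):
        self.log = []        # records appended to a single partition
        self.last_seq = {}   # producer_id -> highest sequence number accepted

    def append(self, producer_id, seq, value):
        """Append the record unless this producer already wrote this sequence."""
        if seq <= self.last_seq.get(producer_id, -1):
            return False     # retry of an already-written record: dropped
        self.last_seq[producer_id] = seq
        self.log.append(value)
        return True


broker = ToyBroker()

# Producer session 1 writes m0 and m1, then retries m1 after a lost ack.
broker.append(producer_id=1, seq=0, value="m0")
broker.append(producer_id=1, seq=1, value="m1")
retry_accepted = broker.append(producer_id=1, seq=1, value="m1")

# Assumed scenario: the mirroring task restarts before the source offset of m1
# was committed, re-reads m1, and sends it through a new producer session.
replay_accepted = broker.append(producer_id=2, seq=0, value="m1")

print(retry_accepted, replay_accepted, broker.log)
```

In this model the retry is rejected while the replay is accepted, so the log ends up holding m1 twice even though every producer session was idempotent.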
Re: Mirrormaker 2.0 - duplicates with idempotence enabled
Hi Vangelis,

Without transactions, Kafka (including MM) is either at-least-once or at-most-once. Most use cases prefer at-least-once.

https://supergloo.com/kafka/kafka-architecture-delivery/#:~:text=Now%2C%20Kafka%20provides%20%E2%80%9Cat%2D,case%2C%20data%20could%20be%20duplicated.&text=Processing%20in%20batches%20of%20records%20is%20available%20in%20Kafka%20as%20well.

On 2021/03/18 08:34:39, Vangelis Typaldos wrote:
> Thank you, Ning, for your response.
>
> Could you please point me to a statement or a specific part of the code
> where the "at-least-once" delivery guarantee is mentioned? Just for the
> record.
>
> Kind regards,
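The difference between the two guarantees named in this reply comes down to whether the source offset is committed before or after the record is written to the target. The following is a minimal simulation of that ordering, not MirrorMaker code: the mirror function, its parameters, and the single simulated crash are all assumptions made for illustration.

```python
def mirror(source, commit_first, crash_at):
    """Copy records from source to a new target list, crashing once at crash_at.

    commit_first=True  -> commit the offset, then send (at-most-once)
    commit_first=False -> send, then commit the offset (at-least-once)
    """
    target = []
    committed = 0      # next source offset to read after a restart
    crashed = False
    while committed < len(source):
        offset = committed
        if commit_first:
            committed = offset + 1
            if offset == crash_at and not crashed:
                crashed = True
                continue   # crash after commit, before send: record is lost
            target.append(source[offset])
        else:
            target.append(source[offset])
            if offset == crash_at and not crashed:
                crashed = True
                continue   # crash after send, before commit: record is re-sent
            committed = offset + 1
    return target


records = ["a", "b", "c"]
at_most_once = mirror(records, commit_first=True, crash_at=1)
at_least_once = mirror(records, commit_first=False, crash_at=1)
print(at_most_once, at_least_once)
```

With a crash at offset 1, the commit-first variant drops "b" and the send-first variant delivers "b" twice, which matches the trade-off described above: without transactions, one of the two has to be accepted.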