Thanks very much for the response.

Could you please elaborate a bit more on "I'd arc in that direction.
Instead of migrating A->B->C->D..., active/active is more like having one
big cluster"?

Another thing I would like to share is that currently my consumers only
consume from one topic, so introducing MM2 will affect them.
Any suggestion in this regard would be greatly appreciated.

Thanks in advance again!


On Mon, Feb 10, 2020 at 9:40 PM Ryanne Dolan <ryannedo...@gmail.com> wrote:

> Hello, sounds like you have this all figured out actually. A couple notes:
>
> > For now, we just need to handle DR requirements, i.e., we would not
> > need active-active
>
> If your infrastructure is sufficiently advanced, active/active can be a lot
> easier to manage than active/standby. If you are starting from scratch I'd
> arc in that direction. Instead of migrating A->B->C->D..., active/active is
> more like having one big cluster.
>
> > secondary.primary.topic1
>
> I'd recommend using regex subscriptions where possible, so that apps don't
> need to worry about these potentially complex topic names.
>
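A minimal sketch of the regex-subscription idea, to check which topic names
one pattern would pick up. The pattern and the topic list are assumptions
made up for illustration (they are not from the thread or from MM2 itself);
only Python's standard "re" module is used, no Kafka client.

```python
import re

# Hypothetical pattern: "topic1", optionally preceded by one or more
# cluster-alias prefixes such as "primary." or "secondary.primary.".
TOPIC1_PATTERN = re.compile(r"^([A-Za-z0-9_-]+\.)*topic1$")


def matching_topics(pattern, topic_names):
    """Return the topic names that the pattern matches in full."""
    return [name for name in topic_names if pattern.match(name)]


# Topic names as they might exist on primary-2 after the failback in step D,
# plus two unrelated names that must not be selected.
topics_on_primary_2 = [
    "topic1",
    "secondary.topic1",
    "secondary.primary.topic1",
    "topic10",
    "other.topic2",
]

selected = matching_topics(TOPIC1_PATTERN, topics_on_primary_2)
print(selected)
```

With the Java client, an equivalent pattern would be passed to
KafkaConsumer.subscribe(java.util.regex.Pattern). Since topic names may
themselves contain dots, a pattern like this should be checked against the
real topic list before relying on it.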
> > An additional question. If the topic is compacted, i.e., the topic is
> > kept forever, would switchover operations imply adding an additional
> > path in the topic name?
>
> I think that's right. You could always clean things up manually, but
> migrating between clusters a bunch of times would leave a trail of
> replication hops.
>
> Also, you might look into implementing a custom ReplicationPolicy. For
> example, you could squash "secondary.primary.topic1" into something shorter
> if you like.
>
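A rough sketch of what "squashing" could mean, written as plain naming logic
rather than as an actual ReplicationPolicy class. The alias set and the rule
(keep only the alias of the cluster where the topic originated) are
assumptions chosen for illustration; other schemes are possible.

```python
# Hypothetical cluster aliases; a real deployment would take these from
# its own MM2 configuration.
KNOWN_ALIASES = {"primary", "primary-2", "secondary"}


def default_remote_topic(source_alias, topic):
    """Default-style naming: prefix the topic with the source cluster alias."""
    return f"{source_alias}.{topic}"


def squashed_remote_topic(source_alias, topic, known_aliases=KNOWN_ALIASES):
    """Illustrative squashing: keep only the originating cluster's alias."""
    parts = topic.split(".")
    if len(parts) > 1 and parts[0] in known_aliases:
        # Already a remote topic: drop intermediate hops and keep the
        # alias closest to the base topic name (the origin).
        while len(parts) > 2 and parts[1] in known_aliases:
            parts.pop(0)
        return ".".join(parts)
    # Local topic on the source cluster: name it as the default would.
    return f"{source_alias}.{topic}"


print(default_remote_topic("secondary", "primary.topic1"))
print(squashed_remote_topic("secondary", "primary.topic1"))
```

This covers only the naming side. A real policy would implement the Java
ReplicationPolicy interface (formatRemoteTopic, topicSource, upstreamTopic)
and would need to keep those methods consistent with whatever names it
produces, and to avoid two different sources collapsing into the same name.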
> Ryanne
>
> On Mon, Feb 10, 2020 at 1:24 PM benitocm <benit...@gmail.com> wrote:
>
> > Hi,
> >
> > After having a look at the talk
> > https://www.confluent.io/kafka-summit-lon19/disaster-recovery-with-mirrormaker-2-0
> > and the KIP at
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0#KIP-382
> > I am trying to understand how I would use it in the setup that I have.
> > For now, we just need to handle DR requirements, i.e., we would not
> > need active-active.
> >
> > My requirements, more or less, are the following:
> >
> > 1) Currently, we have just one Kafka cluster "primary" where all the
> > producers are producing to and where all the consumers are consuming
> > from.
> > 2) In case "primary" crashes, we would need to have another Kafka
> > cluster "secondary" where we will move all the producers and consumers
> > and keep working.
> > 3) Once "primary" is recovered, we would need to move back to it (as we
> > were in #1).
> >
> > To fulfill #2, I have thought of having a new Kafka cluster "secondary"
> > and setting up a replication procedure using MM2. However, it is not
> > clear to me how to proceed.
> >
> > I will describe the high-level details so you guys can point out my
> > misconceptions:
> >
> > A) Initial situation. As in the example of KIP-382, in the primary
> > cluster we will have a local topic "topic1" that the producers produce
> > to and the consumers consume from. MM2 will create in the secondary the
> > remote topic "primary.topic1", to which the local topic in the primary
> > will be replicated. In addition, the consumer group information of the
> > primary will also be replicated.
> >
> > B) Kafka primary cluster is not available. Producers are moved to
> > produce into the "topic1" that was manually created in the secondary.
> > In addition, consumers need to connect to the secondary to consume from
> > the local topic "topic1", where the producers are now producing, and
> > from the remote topic "primary.topic1", where the producers were
> > producing before, i.e., consumers will need to aggregate. This is
> > because some consumers could have lag, so they will need to consume
> > from both. In this situation, the local topic "topic1" in the secondary
> > will receive new messages and will be consumed (its consumption
> > information will also change), while the remote topic "primary.topic1"
> > will not receive new messages but will still be consumed (its
> > consumption information will change).
> >
> > At this point, my conclusion is that consumers need to consume from both
> > topics (the new messages produced in the local topic and the old messages
> > for consumers that had a lag)
> >
> > C) The primary cluster is recovered (this is where things get
> > complicated for me). In the talk, the new primary is renamed to
> > primary-2 and MM2 is configured for active-active replication.
> > The result is the following. The secondary cluster will end up with a
> > new remote topic ("primary-2.topic1") that will contain a replica of
> > the new "topic1" created in the primary-2 cluster. The primary-2
> > cluster will have 3 topics: "topic1", a new topic where producers will
> > produce in the near future; "secondary.topic1", which contains the
> > replica of the local topic "topic1" in the secondary; and
> > "secondary.primary.topic1", which is "topic1" of the old primary
> > (obtained through the secondary).
> >
> > D) Once all the replicas are in sync, producers and consumers will be
> > moved to primary-2. Producers will produce to the local topic "topic1"
> > of the primary-2 cluster. The consumers will connect to primary-2 to
> > consume from "topic1" (new messages that come in), "secondary.topic1"
> > (messages produced during the outage) and "secondary.primary.topic1"
> > (old messages).
> >
> > If topics have a retention time, e.g. 7 days, we could remove
> > "secondary.primary.topic1" after a few days, leaving the situation as at
> > the beginning. However, if another problem happens in the middle, the
> > number of topics could be a little difficult to handle.
> >
> > An additional question. If the topic is compacted, i.e., the topic is
> > kept forever, would switchover operations imply adding an additional
> > path in the topic name?
> >
> > I would appreciate some guidance with this.
> >
> > Regards
> >
>
