Hey y'all, I'd like to draw your attention to a new section in KIP-382 re
MirrorMaker Clusters:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-382:+MirrorMaker+2.0#KIP-382:MirrorMaker2.0-MirrorMakerClusters

A common concern I hear about using Connect for replication is that all
SourceConnectors in a Connect cluster must use the same target Kafka
cluster, and likewise all SinkConnectors must use the same source Kafka
cluster. In order to use multiple Kafka clusters from Connect, there are
two possible approaches:

1) use an intermediate Kafka cluster, K. SourceConnectors (A, B, C) write
to K and SinkConnectors (X, Y, Z) read from K. This enables flows like A ->
K -> X, but means that some topologies require extraneous hops, and that K
must be scaled to handle records from all sources and sinks.

2) use multiple Connect clusters, one for each target cluster. Each cluster
has multiple SourceConnectors, one for each source cluster. This enables
direct replication of A -> X but means there is a proliferation of Connect
clusters, each of which must be managed separately.

Both options are viable for small deployments involving a small number of
Kafka clusters in a small number of data centers. However, neither is
scalable, especially from an operational standpoint.

KIP-382 now introduces "MirrorMaker clusters", which are distinct from
Connect clusters. A single MirrorMaker cluster provides
"Replication-as-a-Service" among any number of Kafka clusters via a
high-level REST API based on the Connect API. Under the hood, MirrorMaker
sets up Connectors between each pair of Kafka clusters. The REST API
enables on-the-fly reconfiguration of each Connector, including updates to
topic whitelists/blacklists.
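To illustrate the fan-out, here is a quick sketch of the pair-wise Connector setup in plain Python (the function name and structure are hypothetical, just to show the scaling, not the actual implementation):

```python
from itertools import permutations

def connector_pairs(clusters):
    """Enumerate the directed (source, target) pairs a single MirrorMaker
    cluster would set up Connectors for: N clusters yield N*(N-1) flows."""
    return list(permutations(clusters, 2))

# Each pair corresponds to one source->target replication flow.
pairs = connector_pairs(["us-west", "us-east", "eu-central"])
print(len(pairs))   # 3 clusters -> 6 directed flows
print(pairs[0])
```

So one MirrorMaker cluster covers the full mesh that would otherwise require a Connect cluster per target.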

To configure MirrorMaker 2.0, you need a configuration file that lists
connection information for each Kafka cluster (broker lists, SSL settings,
etc.). At a minimum, this looks like:

clusters=us-west, us-east
cluster.us-west.broker.list=us-west-kafka-server:9092
cluster.us-east.broker.list=us-east-kafka-server:9092

You can specify topic whitelists and other connector-level settings here
too, or you can use the REST API to remote-control a running cluster.
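For example, connector-level settings could be added per replication flow in the same file. The property names below are illustrative, following the draft syntax above (check the KIP for the exact keys):

```
clusters=us-west, us-east
cluster.us-west.broker.list=us-west-kafka-server:9092
cluster.us-east.broker.list=us-east-kafka-server:9092

# hypothetical per-flow settings:
us-west->us-east.topics.whitelist=orders.*, clicks
us-west->us-east.topics.blacklist=debug.*
```

Anything set here can later be overridden at runtime through the REST API.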

I've also updated the KIP with minor changes to bring it in line with the
current implementation.

Looking forward to your feedback, thanks!
Ryanne

On Mon, Nov 19, 2018 at 10:26 PM Ryanne Dolan <ryannedo...@gmail.com> wrote:

> Dan, you've got it right. ACL sync will be done by MM2 automatically
> (unless disabled) according to simple rules:
>
> - If a principal has READ access on a topic in a source cluster, the same
> principal should have READ access on downstream replicated topics ("remote
> topics").
> - Only MM2 has WRITE access on "remote topics".
>
> This covers sync from upstream topics like "topic1" to downstream remote
> topics like "us-west.topic1". What's missing from the KIP, as you point
> out, is ACL sync between normal topics (non-remote). If a consumer has READ
> access to topic1 in an upstream cluster, should it have READ access in
> topic1 in a downstream cluster?
>
> I think the answer generally is no, you don't want to give principals
> blanket permissions across all DCs automatically. For example, I've seen
> scenarios where certain topics are replicated between an internal and
> external Kafka cluster. You don't want to accidentally push ACL changes
> across this boundary.
>
> Moreover, it's clear that MM2 "owns" downstream remote topics like
> "us-west.topic1" -- MM2 is the only producer and the only admin of these
> topics -- so it's natural to have MM2 set the ACL for these topics. But I
> think it would be surprising if MM2 tried to manipulate topics it doesn't
> own. So I think granting permissions across DCs is probably outside MM2's
> purview, but I agree it'd be nice to have tooling to help with this.
>
> Thanks.
> Ryanne
>
> --
> www.ryannedolan.info
>
>
> On Mon, Nov 19, 2018 at 3:58 PM daniel.loci...@gmail.com <
> daniel.loci...@gmail.com> wrote:
>
>> Hi guys,
>>
>> This is an exciting topic. Could I have a word here?
>> I understand there are many scenarios in which we can apply MirrorMaker. I
>> am at the moment working on an active/active DC solution using MirrorMaker;
>> our goal is to allow the clients to fail over to the other Kafka
>> cluster (on the other DC) when an incident happens.
>>
>> To do this, I need:
>> 1. MirrorMaker to replicate the partitioned messages in sequential order
>> (in a timely fashion) to the same partition on the other cluster (also
>> keeping the promise that both clusters create the same number of partitions
>> for a topic) – so that a consumer can pick up the right order of the latest
>> messages
>> 2. MirrorMaker to replicate the local consumer offsets to the other side –
>> so that the consumer knows where the offset / latest messages are
>> 3. MirrorMaker to provide cycle detection for messages across the DCs.
>>
>> I can see the possibility for Remote Topics to solve all these problems,
>> as long as the consumer can see the remote topic equally with the local
>> topic, i.e. for a consumer which has permission to consume topic1, on a
>> subscribe event it can automatically subscribe to both remote.topic1 and
>> local.topic1. First, we need to find a way to grant topic ACLs for the
>> consumer across the DCs. Secondly, the consumer needs to be able to
>> subscribe to topics with a wildcard or suffix. Last but not least, the
>> consumer has to deal with the timely ordering of the messages from the 2
>> topics.
>>
>> My understanding is, all of these should be configurable to be turned on
>> or off, to fit for different use cases.
>>
>> Interestingly, I was going to propose topic messages with extra headers of
>> source-DC info, for cycle detection...
>>
>> Looking forward to your reply.
>>
>> Regards,
>>
>> Dan
>> On 2018/10/23 19:56:02, Ryanne Dolan <ryannedo...@gmail.com> wrote:
>> > Alex, thanks for the feedback.
>> >
>> > > Would it be possible to utilize the
>> > > Message Headers feature to prevent infinite recursion
>> >
>> > This isn't necessary due to the topic renaming feature which already
>> > prevents infinite recursion.
>> >
>> > If you turn off topic renaming you lose cycle detection, so maybe we
>> could
>> > provide message headers as an optional second mechanism. I'm not
>> opposed to
>> > that idea, but there are ways to improve efficiency if we don't need to
>> > modify or inspect individual records.
>> >
>> > Ryanne
>> >
>> > On Tue, Oct 23, 2018 at 6:06 AM Alex Mironov <alexandr...@gmail.com>
>> wrote:
>> >
>> > > Hey Ryanne,
>> > >
>> > > Awesome KIP, excited to see improvements in MirrorMaker land, I
>> particularly
>> > > like the reuse of Connect framework! Would it be possible to utilize
>> the
>> > > Message Headers feature to prevent infinite recursion? For example,
>> MM2
>> > > could stamp every message with a special header payload (e.g.
>> > > MM2="cluster-name-foo") so in case another MM2 instance sees this
>> message
>> > > and it is configured to replicate data into "cluster-name-foo" it
>> would
>> > > just skip it instead of replicating it back.
>> > >
>> > > On Sat, Oct 20, 2018 at 5:48 AM Ryanne Dolan <ryannedo...@gmail.com>
>> > > wrote:
>> > >
>> > > > Thanks Harsha. Done.
>> > > >
>> > > > On Fri, Oct 19, 2018 at 1:03 AM Harsha Chintalapani <
>> ka...@harsha.io>
>> > > > wrote:
>> > > >
>> > > > > Ryanne,
>> > > > >        Makes sense. Can you please add this under rejected
>> > > > > alternatives so that everyone has context on why it wasn’t picked.
>> > > > >
>> > > > > Thanks,
>> > > > > Harsha
>> > > > > On Oct 18, 2018, 8:02 AM -0700, Ryanne Dolan <
>> ryannedo...@gmail.com>,
>> > > > > wrote:
>> > > > >
>> > > > > Harsha, concerning uReplicator specifically, the project is a
>> major
>> > > > > inspiration for MM2, but I don't think it is a good foundation for
>> > > > anything
>> > > > > included in Apache Kafka. uReplicator uses Helix to solve
>> problems that
>> > > > > Connect also solves, e.g. REST API, live configuration changes,
>> cluster
>> > > > > management, coordination etc. This also means that existing
>> tooling,
>> > > > > dashboards etc that work with Connectors do not work with
>> uReplicator,
>> > > > and
>> > > > > any future tooling would need to treat uReplicator as a special
>> case.
>> > > > >
>> > > > > Ryanne
>> > > > >
>> > > > > On Wed, Oct 17, 2018 at 12:30 PM Ryanne Dolan <
>> ryannedo...@gmail.com>
>> > > > > wrote:
>> > > > >
>> > > > >> Harsha, yes I can do that. I'll update the KIP accordingly,
>> thanks.
>> > > > >>
>> > > > >> Ryanne
>> > > > >>
>> > > > >> On Wed, Oct 17, 2018 at 12:18 PM Harsha <ka...@harsha.io> wrote:
>> > > > >>
>> > > > >>> Hi Ryanne,
>> > > > >>>                Thanks for the KIP. I am also curious about why not
>> > > > >>> use the uReplicator design as the foundation, given it already
>> > > > >>> resolves some of the fundamental issues in the current
>> > > > >>> MirrorMaker: updating the configs on the fly and running the
>> > > > >>> mirror maker agents in a worker model which can be deployed in
>> > > > >>> Mesos or container orchestrations. If possible, can you document
>> > > > >>> in the rejected alternatives what missing parts made you consider
>> > > > >>> a new design from the ground up?
>> > > > >>>
>> > > > >>> Thanks,
>> > > > >>> Harsha
>> > > > >>>
>> > > > >>> On Wed, Oct 17, 2018, at 8:34 AM, Ryanne Dolan wrote:
>> > > > >>> > Jan, these are two separate issues.
>> > > > >>> >
>> > > > >>> > 1) consumer coordination should not, ideally, involve
>> unreliable or
>> > > > >>> slow
>> > > > >>> > connections. Naively, a KafkaSourceConnector would coordinate
>> via
>> > > the
>> > > > >>> > source cluster. We can do better than this, but I'm deferring
>> this
>> > > > >>> > optimization for now.
>> > > > >>> >
>> > > > >>> > 2) exactly-once between two clusters is mind-bending. But
>> keep in
>> > > > mind
>> > > > >>> that
>> > > > >>> > transactions are managed by the producer, not the consumer. In
>> > > fact,
>> > > > >>> it's
>> > > > >>> > the producer that requests that offsets be committed for the
>> > > current
>> > > > >>> > transaction. Obviously, these offsets are committed in
>> whatever
>> > > > >>> cluster the
>> > > > >>> > producer is sending to.
>> > > > >>> >
>> > > > >>> > These two issues are closely related. They are both resolved
>> by not
>> > > > >>> > coordinating or committing via the source cluster. And in
>> fact,
>> > > this
>> > > > >>> is the
>> > > > >>> > general model of SourceConnectors anyway, since most
>> > > SourceConnectors
>> > > > >>> > _only_ have a destination cluster.
>> > > > >>> >
>> > > > >>> > If there is a lot of interest here, I can expound further on
>> this
>> > > > >>> aspect of
>> > > > >>> > MM2, but again I think this is premature until this first KIP
>> is
>> > > > >>> approved.
>> > > > >>> > I intend to address each of these in separate KIPs following
>> this
>> > > > one.
>> > > > >>> >
>> > > > >>> > Ryanne
>> > > > >>> >
>> > > > >>> > On Wed, Oct 17, 2018 at 7:09 AM Jan Filipiak <
>> > > > jan.filip...@trivago.com
>> > > > >>> >
>> > > > >>> > wrote:
>> > > > >>> >
>> > > > >>> > > This is not a performance optimisation. It's a fundamental
>> > > > >>> > > design choice.
>> > > > >>> > >
>> > > > >>> > >
>> > > > >>> > > I never really took a look at how Streams does exactly-once.
>> > > > >>> > > (It's a trap anyway, and you usually can deal with
>> > > > >>> > > at-least-once downstream pretty easily.) But I am very certain
>> > > > >>> > > it's not going to get anywhere if the offset-commit and
>> > > > >>> > > record-produce clusters are not the same.
>> > > > >>> > >
>> > > > >>> > > Pretty sure without this _design choice_ you can skip on that
>> > > > >>> > > exactly-once already.
>> > > > >>> > >
>> > > > >>> > > Best Jan
>> > > > >>> > >
>> > > > >>> > > On 16.10.2018 18:16, Ryanne Dolan wrote:
>> > > > >>> > > >  >  But one big obstacle in this was
>> > > > >>> > > > always that group coordination happened on the source
>> cluster.
>> > > > >>> > > >
>> > > > >>> > > > Jan, thank you for bringing up this issue with legacy
>> > > > MirrorMaker.
>> > > > >>> I
>> > > > >>> > > > totally agree with you. This is one of several problems
>> with
>> > > > >>> MirrorMaker
>> > > > >>> > > > I intend to solve in MM2, and I already have a design and
>> > > > >>> prototype that
>> > > > >>> > > > solves this and related issues. But as you pointed out,
>> this
>> > > KIP
>> > > > is
>> > > > >>> > > > already rather complex, and I want to focus on the core
>> feature
>> > > > set
>> > > > >>> > > > rather than performance optimizations for now. If we can
>> agree
>> > > on
>> > > > >>> what
>> > > > >>> > > > MM2 looks like, it will be very easy to agree to improve
>> its
>> > > > >>> performance
>> > > > >>> > > > and reliability.
>> > > > >>> > > >
>> > > > >>> > > > That said, I look forward to your support on a subsequent
>> KIP
>> > > > that
>> > > > >>> > > > addresses consumer coordination and rebalance issues. Stay
>> > > tuned!
>> > > > >>> > > >
>> > > > >>> > > > Ryanne
>> > > > >>> > > >
>> > > > >>> > > > On Tue, Oct 16, 2018 at 6:58 AM Jan Filipiak <
>> > > > >>> jan.filip...@trivago.com
>> > > > >>> > > > <mailto:jan.filip...@trivago.com>> wrote:
>> > > > >>> > > >
>> > > > >>> > > >     Hi,
>> > > > >>> > > >
>> > > > >>> > > >     Currently MirrorMaker is usually run collocated with the
>> > > > >>> > > >     target cluster. This is all nice and good. But one big
>> > > > >>> > > >     obstacle in this was always that group coordination
>> > > > >>> > > >     happened on the source cluster. So when the network was
>> > > > >>> > > >     congested, you sometimes lose group membership and have
>> > > > >>> > > >     to rebalance and all this.
>> > > > >>> > > >
>> > > > >>> > > >     So one big request from us would be support for having a
>> > > > >>> > > >     coordination cluster != source cluster.
>> > > > >>> > > >
>> > > > >>> > > >     I would generally say a LAN is better than a WAN for
>> > > > >>> > > >     doing group coordination, and there is no reason we
>> > > > >>> > > >     couldn't have a group consuming topics from one cluster
>> > > > >>> > > >     and committing offsets to another one, right?
>> > > > >>> > > >
>> > > > >>> > > >     Other than that, it feels like the KIP has too many
>> > > > >>> > > >     features, where many of them are not really wanted and
>> > > > >>> > > >     are counterproductive, but I will just wait and see how
>> > > > >>> > > >     the discussion goes.
>> > > > >>> > > >
>> > > > >>> > > >     Best Jan
>> > > > >>> > > >
>> > > > >>> > > >
>> > > > >>> > > >     On 15.10.2018 18:16, Ryanne Dolan wrote:
>> > > > >>> > > >      > Hey y'all!
>> > > > >>> > > >      >
>> > > > >>> > > >      > Please take a look at KIP-382:
>> > > > >>> > > >      >
>> > > > >>> > > >      >
>> > > > >>> > > >
>> > > > >>> > >
>> > > > >>>
>> > > >
>> > >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0
>> > > > >>> > > >      >
>> > > > >>> > > >      > Thanks for your feedback and support.
>> > > > >>> > > >      >
>> > > > >>> > > >      > Ryanne
>> > > > >>> > > >      >
>> > > > >>> > > >
>> > > > >>> > >
>> > > > >>>
>> > > > >>
>> > > >
>> > >
>> > >
>> > > --
>> > > Best,
>> > > Alex Mironov
>> > >
>> >
>>
>
