Michael, thanks for the comments!

>  would like to see support for this to be done by hops, as well [...]
> This then allows ring (hops = number of brokers in the ring), mesh (every
> cluster interconnected so hop=1), or even a tree (more fine grained setup)
> cluster topology.

That's a good idea, though we can do this at the topic level without
tagging individual records. Since remote topics already carry their
source-cluster prefixes, the hop count can be read off the topic name
itself. A max.hops of 1 would mean "A.topic1" is allowed, but not
"B.A.topic1". I think the default behavior would need to be max.hops = 1
to avoid unexpectedly creating a bunch of D.C.B.A... topics when you set
up a fully-connected mesh topology.
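
To make that concrete, here's a rough sketch (not part of the KIP; the
max.hops setting, separator, and method names are placeholders) of how a
prefix-counting check could work, assuming the separator never appears in
plain topic names:

    import java.util.regex.Pattern;

    class HopCheck {
        // Counts source-cluster prefixes already in the topic name, e.g.
        // "topic1" -> 0 hops, "A.topic1" -> 1 hop, "B.A.topic1" -> 2 hops,
        // and only replicates further if we're still under the limit.
        static boolean shouldReplicate(String sourceTopic, int maxHops, String separator) {
            int hopsSoFar = sourceTopic.split(Pattern.quote(separator)).length - 1;
            return hopsSoFar < maxHops;
        }
    }

With maxHops = 1, "topic1" passes (creating "A.topic1" downstream), but
"A.topic1" does not, so "B.A.topic1" is never created.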

Looking ahead a bit, I can imagine an external tool computing the spanning
tree of topics among a set of clusters based on inter-cluster replication
lag, and setting up MM2 accordingly. But that's probably outside the scope
of this KIP :)

>  ...standalone MirrorMaker connector...
>     ./bin/kafka-mirror-maker-2.sh --consumer consumer.properties
> --producer producer.properties

Eventually, I'd like MM2 to completely replace legacy MirrorMaker,
including the ./bin/kafka-mirror-maker.sh script. In the meantime, it's a
good idea to include a standalone driver, something like
./bin/connect-mirror-maker-standalone.sh driven by the same high-level
configuration file. I'll do that, thanks.
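
For illustration, using the example connection config from the KIP, the
invocation might look something like the following (the script name above
is still tentative, "mm2.properties" is just a placeholder file name, and
passing the config file as a single argument is an assumption):

    # mm2.properties
    clusters=us-west, us-east
    cluster.us-west.broker.list=us-west-kafka-server:9092
    cluster.us-east.broker.list=us-east-kafka-server:9092

    ./bin/connect-mirror-maker-standalone.sh mm2.properties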

> I see no section on providing support for mirror maker Handlers, today
> people can add handlers to have a little extra custom logic if needed, and
> the handler api is public today so should be supported going forwards so
> people are not on mass re-writing these.

Great point. Connect offers single-message transforms and converters for
this purpose, but I agree that we should honor the existing API if
possible. This might be as easy as providing an adapter class between
Connect's Transformation and MirrorMaker's Handler. Maybe file a Jira
ticket to track this?
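
As a very rough sketch of what such an adapter could look like: the
LegacyHandler and HandledRecord types below are hypothetical, simplified
stand-ins for the real handler API, the "handler.class" property is made
up, and it assumes byte[] keys/values and handlers that emit at most one
record per input (Transformation is strictly one-record-in,
one-record-out, so fan-out handlers would need more work):

    import java.util.List;
    import java.util.Map;

    import org.apache.kafka.common.config.ConfigDef;
    import org.apache.kafka.connect.connector.ConnectRecord;
    import org.apache.kafka.connect.errors.ConnectException;
    import org.apache.kafka.connect.transforms.Transformation;

    // Hypothetical, simplified stand-in for the legacy handler API.
    interface LegacyHandler {
        List<HandledRecord> handle(HandledRecord record);
    }

    // Hypothetical minimal record type for this sketch.
    class HandledRecord {
        final String topic;
        final byte[] key;
        final byte[] value;

        HandledRecord(String topic, byte[] key, byte[] value) {
            this.topic = topic;
            this.key = key;
            this.value = value;
        }
    }

    // Sketch: exposes a legacy-style handler as a Connect Transformation.
    public class HandlerAdapter<R extends ConnectRecord<R>> implements Transformation<R> {

        private LegacyHandler handler;

        @Override
        public void configure(Map<String, ?> configs) {
            // "handler.class" is a hypothetical property naming the handler impl.
            String className = (String) configs.get("handler.class");
            try {
                handler = (LegacyHandler) Class.forName(className)
                        .getDeclaredConstructor().newInstance();
            } catch (Exception e) {
                throw new ConnectException("Could not instantiate handler " + className, e);
            }
        }

        @Override
        public R apply(R record) {
            HandledRecord in = new HandledRecord(record.topic(),
                    (byte[]) record.key(), (byte[]) record.value());
            List<HandledRecord> out = handler.handle(in);
            if (out.isEmpty()) {
                return null; // handler chose to drop the record
            }
            HandledRecord h = out.get(0); // only a 1:1 mapping is supported here
            return record.newRecord(h.topic, record.kafkaPartition(),
                    record.keySchema(), h.key, record.valueSchema(), h.value,
                    record.timestamp());
        }

        @Override
        public ConfigDef config() {
            return new ConfigDef();
        }

        @Override
        public void close() {
        }
    }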

Really appreciate your feedback!

Ryanne


On Thu, Dec 6, 2018 at 7:03 PM Michael Pearce <michael.pea...@ig.com> wrote:

> Re hops to stop the cycle and to allow a range of multi cluster
> topologies, see https://www.rabbitmq.com/federated-exchanges.html where
> very similar was done in rabbit.
>
>
>
> On 12/7/18, 12:47 AM, "Michael Pearce" <michael.pea...@ig.com> wrote:
>
>     Nice proposal.
>
>     Some comments.
>
>
>     On the section around cycle detection.
>
>     I would like to see support for this to be done by hops, as well e.g.
> using approach is to use a header for the number of hops, as the mm2
> replicates it increases the hop count and you can make the mm2 configurable
> to only produce messages onwards where hops are less than x.
>     This then allows ring (hops = number of brokers in the ring), mesh
> (every cluster interconnected so hop=1), or even a tree (more fine grained
> setup) cluster topology.
>     FYI we do this currently with the current mirror maker, using a custom
> handler.
>
>
>     On the section around running a standalone MirrorMaker connector
>
>     I would suggest making this as easy to run as the mirrormakers are
> today, with a simple single sh script.
>     I assume this is what is proposed in section "Running MirrorMaker in
> legacy mode" but I would even do this before MM would be removed, with a -2
> varient.
>     e.g.
>     ./bin/kafka-mirror-maker-2.sh --consumer consumer.properties
> --producer producer.properties
>
>     Lastly
>
>     I see no section on providing support for mirror maker Handlers, today
> people can add handlers to have a little extra custom logic if needed, and
> the handler api is public today so should be supported going forwards so
> people are not on mass re-writing these.
>
>     On 12/5/18, 5:36 PM, "Ryanne Dolan" <ryannedo...@gmail.com> wrote:
>
>         Sönke,
>
>         > The only thing that I could come up with is the limitation to a
> single
>         offset commit interval
>
>         Yes, and other internal properties, e.g. those used by the internal
>         consumers and producers, which, granted, probably are not often
> changed
>         from their defaults, but that apply to Connectors across the
> entire cluster.
>
>         Ryanne
>
>         On Wed, Dec 5, 2018 at 3:21 AM Sönke Liebau
>         <soenke.lie...@opencore.com.invalid> wrote:
>
>         > Hi Ryanne,
>         >
>         > when you say "Currently worker configs apply across the entire
> cluster,
>         > which is limiting even for use-cases involving a single Kafka
> cluster.",
>         > may I ask you to elaborate on those limitations a little?
>         > The only thing that I could come up with is the limitation to a
> single
>         > offset commit interval value for all running connectors.
>         > Maybe also the limitation to shared config providers..
>         >
>         > But you sound like you had painful experiences with this before,
> maybe
>         > you'd like to share the burden :)
>         >
>         > Best regards,
>         > Sönke
>         >
>         > On Wed, Dec 5, 2018 at 5:15 AM Ryanne Dolan <
> ryannedo...@gmail.com> wrote:
>         >
>         > > Sönke,
>         > >
>         > > I think so long as we can keep the differences at a very high
> level (i.e.
>         > > the "control plane"), there is little downside to MM2 and
> Connect
>         > > coexisting. I do expect them to converge to some extent, with
> features
>         > from
>         > > MM2 being pulled into Connect whenever this is possible
> without breaking
>         > > things.
>         > >
>         > > I could definitely see your idea re hierarchies or groups of
> connectors
>         > > being useful outside MM2. Currently "worker configs" apply
> across the
>         > > entire cluster, which is limiting even for use-cases involving
> a single
>         > > Kafka cluster. If Connect supported multiple workers in the
> same cluster,
>         > > it would start to look a lot like a MM2 cluster.
>         > >
>         > > Ryanne
>         > >
>         > > On Tue, Dec 4, 2018 at 3:26 PM Sönke Liebau
>         > > <soenke.lie...@opencore.com.invalid> wrote:
>         > >
>         > > > Hi Ryanne,
>         > > >
>         > > > thanks for your response!
>         > > >
>         > > > It seems like you have already done a lot of investigation
> into the
>         > > > existing code and the solution design and all of what you
> write makes
>         > > sense
>         > > > to me. Would it potentially be worth adding this to the KIP,
> now that
>         > you
>         > > > had to write it up because of me anyway?
>         > > >
>         > > > However, I am afraid that I am still not entirely convinced
> of the
>         > > > fundamental benefit this provides over an extended Connect
> that has the
>         > > > following functionality:
>         > > > - allow for organizing connectors into a hierarchical
> structure -
>         > > > "clusters/us-west/..."
>         > > > - allow defining external Kafka clusters to be used by
> Source and Sink
>         > > > connectors instead of the local cluster
>         > > >
>         > > > Personally I think both of these features are useful
> additions to
>         > > Connect,
>         > > > I'll address both separately below.
>         > > >
>         > > > Allowing to structure connectors in a hierarchy
>         > > > Organizing running connectors will grow more important as
> corporate
>         > > > customers adapt Connect and installations grow in size.
> Additionally
>         > this
>         > > > could be useful for ACLs in case they are ever added to
> Connect, as you
>         > > > could allow specific users access only to specific
> namespaces (and
>         > until
>         > > > ACLs are added it would facilitate using a reverse proxy for
> the same
>         > > > effect).
>         > > >
>         > > > Allow accessing multiple external clusters
>         > > > The reasoning for this feature is pretty much the same as
> for a central
>         > > > Mirror Maker cluster, if a company has multiple clusters for
> whatever
>         > > > reason but wants to have ingest centralized in one system
> aka one
>         > Connect
>         > > > cluster they would need the ability to read from and write
> to an
>         > > arbitrary
>         > > > number of Kafka clusters.
>         > > > I haven't really looked at the code, just poked around a
> couple of
>         > > minutes,
>         > > > but it appears like this could be done with fairly low
> effort. My
>         > general
>         > > > idea would be to leave the existing configuration options
> untouched -
>         > > > Connect will always need a "primary" cluster that is used
> for storage
>         > of
>         > > > internal data (config, offsets, status) there is no need to
> break
>         > > existing
>         > > > configs. But additionally allow adding named extra clusters
> by
>         > specifying
>         > > > options like
>         > > >   external.sales_cluster.bootstrap_servers=...
>         > > >   external.sales_cluster.ssl.keystore.location=...
>         > > >   external.marketing_cluster.bootstrap_servers=...
>         > > >
>         > > > The code for status, offset and config storage is mostly
> isolated in
>         > the
>         > > > Kafka[Offset|Status|Config]BackingStore classes and could
> remain pretty
>         > > > much unchanged.
>         > > >
>         > > > Producer and consumer creation for Tasks is done in the
> Worker as of
>         > > > KAFKA-7551 and is isolated in two functions. We could add a
> two more
>         > > > functions with an extra argument for the external cluster
> name to be
>         > used
>         > > > and return fitting consumers/producers.
>         > > > The source and sink config would then simply gain an
> optional setting
>         > to
>         > > > specify the cluster name.
>         > > >
>         > > > I am very sure that I am missing a few large issues with
> these ideas,
>         > I'm
>         > > > mostly back-of-the-napkin designing here, but it might be
> worth a
>         > second
>         > > > look.
>         > > >
>         > > > Once we decide to diverge into two clusters: MirrorMaker and
> Connect, I
>         > > > think realistically the chance of those two ever being
> merged again
>         > > because
>         > > > they grow back together is practically zero - hence my
> hesitation.
>         > > >
>         > > > ----
>         > > >
>         > > > All of that being said, I am absolutely happy to agree to
> disagree, I
>         > > think
>         > > > to a certain extent this is down to a question of personal
>         > > > style/preference. And as this is your baby and you have put
> a lot more
>         > > > effort and thought into it than I ever will I'll shut up now
> :)
>         > > >
>         > > > Again, thanks for all your good work!
>         > > >
>         > > > Best regards,
>         > > > Sönke
>         > > >
>         > > > On Fri, Nov 30, 2018 at 9:00 PM Ryanne Dolan <
> ryannedo...@gmail.com>
>         > > > wrote:
>         > > >
>         > > > > Thanks Sönke.
>         > > > >
>         > > > > > it just feels to me like an awful lot of Connect
> functionality
>         > would
>         > > > need
>         > > > > to be reimplemented or at least wrapped
>         > > > >
>         > > > > Connect currently has two drivers, ConnectDistributed and
>         > > > > ConnectStandalone. Both set up a Herder, which manages
> Workers. I've
>         > > > > implemented a third driver which sets up multiple Herders,
> one for
>         > each
>         > > > > Kafka cluster as specified in a config file. From the
> Herder level
>         > > down,
>         > > > > nothing is changed or duplicated -- it's just Connect.
>         > > > >
>         > > > > For the REST API, Connect wraps a Herder in a RestServer
> class, which
>         > > > > creates a Jetty server with a few JAX-RS resources. One of
> these
>         > > > resources
>         > > > > is ConnectorsResource, which is the real meat of the REST
> API,
>         > enabling
>         > > > > start, stop, creation, deletion, and configuration of
> Connectors.
>         > > > >
>         > > > > I've added MirrorRestServer, which wraps a set of Herders
> instead of
>         > > one.
>         > > > > The server exposes a single resource, ClustersResource,
> which is
>         > only a
>         > > > few
>         > > > > lines of code:
>         > > > >
>         > > > > @GET
>         > > > > @Path("/")
>         > > > > public Collection<String> listClusters() {
>         > > > >   return clusters.keySet();
>         > > > > }
>         > > > >
>         > > > > @Path("/{cluster}")
>         > > > > public ConnectorsResource
>         > getConnectorsForCluster(@PathParam("cluster")
>         > > > > cluster) {
>         > > > >   return new ConnectorsResource(clusters.get(cluster));
>         > > > > }
>         > > > >
>         > > > > (simplified a bit and subject to change)
>         > > > >
>         > > > > The ClustersResource defers to the existing
> ConnectorsResource, which
>         > > > again
>         > > > > is most of the Connect API. With this in place, I can make
> requests
>         > > like:
>         > > > >
>         > > > > GET /clusters
>         > > > >
>         > > > > GET /clusters/us-west/connectors
>         > > > >
>         > > > > PUT /clusters/us-west/connectors/us-east/config
>         > > > > { "topics" : "topic1" }
>         > > > >
>         > > > > etc.
>         > > > >
>         > > > > So on the whole, very little code is involved in
> implementing
>         > > > "MirrorMaker
>         > > > > clusters". I won't rule out adding additional features on
> top of this
>         > > > basic
>         > > > > API, but nothing should require re-implementing what is
> already in
>         > > > Connect.
>         > > > >
>         > > > > > Wouldn't it be a viable alternative to look into
> extending Connect
>         > > > itself
>         > > > >
>         > > > > Maybe Connect will evolve to the point where Connect
> clusters and
>         > > > > MirrorMaker clusters are indistinguishable, but I think
> this is
>         > > unlikely,
>         > > > > since really no use-case outside replication would benefit
> from the
>         > > added
>         > > > > complexity. Moreover, I think support for multiple Kafka
> clusters
>         > would
>         > > > be
>         > > > > hard to add without significant changes to the existing
> APIs and
>         > > configs,
>         > > > > which all assume a single Kafka cluster. I think
> Connect-as-a-Service
>         > > and
>         > > > > Replication-as-a-Service are sufficiently different
> use-cases that we
>         > > > > should expect the APIs and configuration files to be at
> least
>         > slightly
>         > > > > different, even if both use the same framework underneath.
> That
>         > said, I
>         > > > do
>         > > > > plan to contribute a few improvements to the Connect
> framework in
>         > > support
>         > > > > of MM2 -- just nothing within the scope of the current KIP.
>         > > > >
>         > > > > Thanks again!
>         > > > > Ryanne
>         > > > >
>         > > > >
>         > > > > On Fri, Nov 30, 2018 at 3:47 AM Sönke Liebau
>         > > > > <soenke.lie...@opencore.com.invalid> wrote:
>         > > > >
>         > > > > > Hi Ryanne,
>         > > > > >
>         > > > > > thanks. I missed the remote to remote replication
> scenario in my
>         > > train
>         > > > of
>         > > > > > thought, you are right.
>         > > > > >
>         > > > > > That being said I have to admit that I am not yet fully
> on board
>         > with
>         > > > the
>         > > > > > concept, sorry. But I might just be misunderstanding
> what your
>         > > > intention
>         > > > > > is. Let me try and explain what I think it is you are
> trying to do
>         > > and
>         > > > > why
>         > > > > > I am on the fence about that and take it from there.
>         > > > > >
>         > > > > > You want to create an extra mirrormaker driver class
> which will
>         > take
>         > > > > > multiple clusters as configuration options. Based on
> these clusters
>         > > it
>         > > > > will
>         > > > > > then reuse the connect workers and create as many as
> necessary to
>         > be
>         > > > able
>         > > > > > to replicate to/from each of those configured clusters.
> It will
>         > then
>         > > > > > expose a rest api (since you stated subset of Connect
> rest api I
>         > > assume
>         > > > > it
>         > > > > > will be a new / own one?)  that allows users to send
> requests like
>         > > > > > "replicate topic a from cluster 1 to cluster 1" and
> start a
>         > connector
>         > > > on
>         > > > > > the relevant worker that can offer this "route".
>         > > > > > This can be extended to a cluster by starting mirror
> maker drivers
>         > on
>         > > > > other
>         > > > > > nodes with the same config and it would offer all the
> connect
>         > > features
>         > > > of
>         > > > > > balancing restarting in case of failure etc.
>         > > > > >
>         > > > > > If this understanding is correct then it just feels to
> me like an
>         > > awful
>         > > > > lot
>         > > > > > of Connect functionality would need to be reimplemented
> or at least
>         > > > > > wrapped, which potentially could mean additional effort
> for
>         > > maintaining
>         > > > > and
>         > > > > > extending Connect down the line. Wouldn't it be a viable
>         > alternative
>         > > to
>         > > > > > look into extending Connect itself to allow defining
> "remote
>         > > clusters"
>         > > > > > which can then be specified in the connector config to
> be used
>         > > instead
>         > > > of
>         > > > > > the local cluster? I imagine that change itself would
> not be too
>         > > > > extensive,
>         > > > > > the main effort would probably be in coming up with a
> sensible
>         > config
>         > > > > > structure and ensuring backwards compatibility with
> existing
>         > > connector
>         > > > > > configs.
>         > > > > > This would still allow to use a regular Connect cluster
> for an
>         > > > arbitrary
>         > > > > > number of clusters, thus still having a dedicated
> MirrorMaker
>         > cluster
>         > > > by
>         > > > > > running only MirrorMaker Connectors in there if you want
> the
>         > > > isolation. I
>         > > > > > agree that it would not offer the level of abstraction
> around
>         > > > replication
>         > > > > > that your concept would enable to implement, but I think
> if would
>         > be
>         > > > far
>         > > > > > less implementation and maintenance effort.
>         > > > > >
>         > > > > > But again, all of that is based on my, potentially
> flawed,
>         > > > understanding
>         > > > > of
>         > > > > > your proposal, please feel free to correct me :)
>         > > > > >
>         > > > > > Best regards,
>         > > > > > Sönke
>         > > > > >
>         > > > > > On Fri, Nov 30, 2018 at 1:39 AM Ryanne Dolan <
>         > ryannedo...@gmail.com>
>         > > > > > wrote:
>         > > > > >
>         > > > > > > Sönke, thanks for the feedback!
>         > > > > > >
>         > > > > > > >  the renaming policy [...] can be disabled [...] The
> KIP itself
>         > > > does
>         > > > > > not
>         > > > > > > mention this
>         > > > > > >
>         > > > > > > Good catch. I've updated the KIP to call this out.
>         > > > > > >
>         > > > > > > > "MirrorMaker clusters" I am not sure I fully
> understand the
>         > issue
>         > > > you
>         > > > > > > are trying to solve
>         > > > > > >
>         > > > > > > MirrorMaker today is not scalable from an operational
>         > perspective.
>         > > > > Celia
>         > > > > > > Kung at LinkedIn does a great job of explaining this
> problem [1],
>         > > > which
>         > > > > > has
>         > > > > > > caused LinkedIn to drop MirrorMaker in favor of
> Brooklin. With
>         > > > > Brooklin,
>         > > > > > a
>         > > > > > > single cluster, single API, and single UI controls
> replication
>         > > flows
>         > > > > for
>         > > > > > an
>         > > > > > > entire data center. With MirrorMaker 2.0, the vision
> is much the
>         > > > same.
>         > > > > > >
>         > > > > > > If your data center consists of a small number of
> Kafka clusters
>         > > and
>         > > > an
>         > > > > > > existing Connect cluster, it might make more sense to
> re-use the
>         > > > > Connect
>         > > > > > > cluster with MirrorSource/SinkConnectors. There's
> nothing wrong
>         > > with
>         > > > > this
>         > > > > > > approach for small deployments, but this model also
> doesn't
>         > scale.
>         > > > This
>         > > > > > is
>         > > > > > > because Connect clusters are built around a single
> Kafka cluster
>         > --
>         > > > > what
>         > > > > > I
>         > > > > > > call the "primary" cluster -- and all Connectors in
> the cluster
>         > > must
>         > > > > > either
>         > > > > > > consume from or produce to this single cluster. If you
> have more
>         > > than
>         > > > > one
>         > > > > > > "active" Kafka cluster in each data center, you'll end
> up needing
>         > > > > > multiple
>         > > > > > > Connect clusters there as well.
>         > > > > > >
>         > > > > > > The problem with Connect clusters for replication is
> way less
>         > > severe
>         > > > > > > compared to legacy MirrorMaker. Generally you need one
> Connect
>         > > > cluster
>         > > > > > per
>         > > > > > > active Kafka cluster. As you point out, MM2's
> SinkConnector means
>         > > you
>         > > > > can
>         > > > > > > get away with a single Connect cluster for topologies
> that center
>         > > > > around
>         > > > > > a
>         > > > > > > single primary cluster. But each Connector within each
> Connect
>         > > > cluster
>         > > > > > must
>         > > > > > > be configured independently, with no high-level view
> of your
>         > > > > replication
>         > > > > > > flows within and between data centers.
>         > > > > > >
>         > > > > > > With MirrorMaker 2.0, a single MirrorMaker cluster
> manages
>         > > > replication
>         > > > > > > across any number of Kafka clusters. Much like
> Brooklin, MM2 does
>         > > the
>         > > > > > work
>         > > > > > > of setting up connectors between clusters as needed.
> This
>         > > > > > > Replication-as-a-Service is a huge win for larger
> deployments, as
>         > > > well
>         > > > > as
>         > > > > > > for organizations that haven't adopted Connect.
>         > > > > > >
>         > > > > > > [1]
>         > > > > > >
>         > > > > >
>         > > > >
>         > > >
>         > >
>         >
> https://www.slideshare.net/ConfluentInc/more-data-more-problems-scaling-kafkamirroring-pipelines-at-linkedin
>         > > > > > >
>         > > > > > > Keep the questions coming! Thanks.
>         > > > > > > Ryanne
>         > > > > > >
>         > > > > > > On Thu, Nov 29, 2018 at 3:30 AM Sönke Liebau <
>         > > > > soenke.lie...@opencore.com
>         > > > > > >
>         > > > > > > wrote:
>         > > > > > >
>         > > > > > >> Hi Ryanne,
>         > > > > > >>
>         > > > > > >> first of all, thanks for the KIP, great work overall
> and much
>         > > > needed I
>         > > > > > >> think!
>         > > > > > >>
>         > > > > > >> I have a small comment on the renaming policy, in one
> of the
>         > mails
>         > > > on
>         > > > > > >> this thread you mention that this can be disabled (to
> replicate
>         > > > topic1
>         > > > > > in
>         > > > > > >> cluster A as topic1 on cluster B I assume). The KIP
> itself does
>         > > not
>         > > > > > mention
>         > > > > > >> this, from reading just the KIP one might get the
> assumption
>         > that
>         > > > > > renaming
>         > > > > > >> is mandatory. It might be useful to add a sentence or
> two around
>         > > > > > renaming
>         > > > > > >> policies and what is possible here. I assume you
> intend to make
>         > > > these
>         > > > > > >> pluggable?
>         > > > > > >>
>         > > > > > >> Regarding the latest addition of "MirrorMaker
> clusters" I am not
>         > > > sure
>         > > > > I
>         > > > > > >> fully understand the issue you are trying to solve
> and what
>         > > exactly
>         > > > > > these
>         > > > > > >> scripts will do - but that may just me being dense
> about it :)
>         > > > > > >> I understand the limitation to a single source and
> target
>         > cluster
>         > > > that
>         > > > > > >> Connect imposes, but isn't this worked around by the
> fact that
>         > you
>         > > > > have
>         > > > > > >> MirrorSource- and MirrorSinkConnectors and one part
> of the
>         > > equation
>         > > > > will
>         > > > > > >> always be under your control?
>         > > > > > >> The way I understood your intention was that there is
> a
>         > (regular,
>         > > > not
>         > > > > > MM)
>         > > > > > >> Connect Cluster somewhere next to a Kafka Cluster A
> and if you
>         > > > deploy
>         > > > > a
>         > > > > > >> MirrorSourceTask to that it will read messages from a
> remote
>         > > > cluster B
>         > > > > > and
>         > > > > > >> replicate them into the local cluster A. If you
> deploy a
>         > > > > MirrorSinkTask
>         > > > > > it
>         > > > > > >> will read from local cluster A and replicate into
> cluster B.
>         > > > > > >>
>         > > > > > >> Since in both causes the configuration for cluster B
> will be
>         > > passed
>         > > > > into
>         > > > > > >> the connector in the ConnectorConfig contained in the
> rest
>         > > request,
>         > > > > > what's
>         > > > > > >> to stop us from starting a third connector with a
>         > MirrorSourceTask
>         > > > > > reading
>         > > > > > >> from cluster C?
>         > > > > > >>
>         > > > > > >> I am a bit hesitant about the entire concept of
> having extra
>         > > scripts
>         > > > > to
>         > > > > > >> run an entire separate Connect cluster - I'd much
> prefer an
>         > option
>         > > > to
>         > > > > > use a
>         > > > > > >> regular connect cluster from an ops point of view. Is
> it maybe
>         > > worth
>         > > > > > >> spending some time investigating whether we can come
> up with a
>         > > > change
>         > > > > to
>         > > > > > >> connect that enables what MM would need?
>         > > > > > >>
>         > > > > > >> Best regards,
>         > > > > > >> Sönke
>         > > > > > >>
>         > > > > > >>
>         > > > > > >>
>         > > > > > >> On Tue, Nov 27, 2018 at 10:02 PM Ryanne Dolan <
>         > > > ryannedo...@gmail.com>
>         > > > > > >> wrote:
>         > > > > > >>
>         > > > > > >>> Hey y'all, I'd like you draw your attention to a new
> section in
>         > > > > KIP-382
>         > > > > > >>> re
>         > > > > > >>> MirrorMaker Clusters:
>         > > > > > >>>
>         > > > > > >>>
>         > > > > > >>>
>         > > > > >
>         > > > >
>         > > >
>         > >
>         >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-382:+MirrorMaker+2.0#KIP-382:MirrorMaker2.0-MirrorMakerClusters
>         > > > > > >>>
>         > > > > > >>> A common concern I hear about using Connect for
> replication is
>         > > that
>         > > > > all
>         > > > > > >>> SourceConnectors in a Connect cluster must use the
> same target
>         > > > Kafka
>         > > > > > >>> cluster, and likewise all SinkConnectors must use
> the same
>         > source
>         > > > > Kafka
>         > > > > > >>> cluster. In order to use multiple Kafka clusters
> from Connect,
>         > > > there
>         > > > > > are
>         > > > > > >>> two possible approaches:
>         > > > > > >>>
>         > > > > > >>> 1) use an intermediate Kafka cluster, K.
> SourceConnectors (A,
>         > B,
>         > > C)
>         > > > > > write
>         > > > > > >>> to K and SinkConnectors (X, Y, Z) read from K. This
> enables
>         > flows
>         > > > > like
>         > > > > > A
>         > > > > > >>> ->
>         > > > > > >>> K - > X but means that some topologies require
> extraneous hops,
>         > > and
>         > > > > > means
>         > > > > > >>> that K must be scaled to handle records from all
> sources and
>         > > sinks.
>         > > > > > >>>
>         > > > > > >>> 2) use multiple Connect clusters, one for each
> target cluster.
>         > > Each
>         > > > > > >>> cluster
>         > > > > > >>> has multiple SourceConnectors, one for each source
> cluster.
>         > This
>         > > > > > enables
>         > > > > > >>> direct replication of A -> X but means there is a
> proliferation
>         > > of
>         > > > > > >>> Connect
>         > > > > > >>> clusters, each of which must be managed separately.
>         > > > > > >>>
>         > > > > > >>> Both options are viable for small deployments
> involving a small
>         > > > > number
>         > > > > > of
>         > > > > > >>> Kafka clusters in a small number of data centers.
> However,
>         > > neither
>         > > > is
>         > > > > > >>> scalable, especially from an operational standpoint.
>         > > > > > >>>
>         > > > > > >>> KIP-382 now introduces "MirrorMaker clusters", which
> are
>         > distinct
>         > > > > from
>         > > > > > >>> Connect clusters. A single MirrorMaker cluster
> provides
>         > > > > > >>> "Replication-as-a-Service" among any number of Kafka
> clusters
>         > > via a
>         > > > > > >>> high-level REST API based on the Connect API. Under
> the hood,
>         > > > > > MirrorMaker
>         > > > > > >>> sets up Connectors between each pair of Kafka
> clusters. The
>         > REST
>         > > > API
>         > > > > > >>> enables on-the-fly reconfiguration of each
> Connector, including
>         > > > > updates
>         > > > > > >>> to
>         > > > > > >>> topic whitelists/blacklists.
>         > > > > > >>>
>         > > > > > >>> To configure MirrorMaker 2.0, you need a
> configuration file
>         > that
>         > > > > lists
>         > > > > > >>> connection information for each Kafka cluster
> (broker lists,
>         > SSL
>         > > > > > settings
>         > > > > > >>> etc). At a minimum, this looks like:
>         > > > > > >>>
>         > > > > > >>> clusters=us-west, us-east
>         > > > > > >>> cluster.us-west.broker.list=us-west-kafka-server:9092
>         > > > > > >>> cluster.us-east.broker.list=us-east-kafka-server:9092
>         > > > > > >>>
>         > > > > > >>> You can specify topic whitelists and other
> connector-level
>         > > settings
>         > > > > > here
>         > > > > > >>> too, or you can use the REST API to remote-control a
> running
>         > > > cluster.
>         > > > > > >>>
>         > > > > > >>> I've also updated the KIP with minor changes to
> bring it in
>         > line
>         > > > with
>         > > > > > the
>         > > > > > >>> current implementation.
>         > > > > > >>>
>         > > > > > >>> Looking forward to your feedback, thanks!
>         > > > > > >>> Ryanne
>         > > > > > >>>
>         > > > > > >>> On Mon, Nov 19, 2018 at 10:26 PM Ryanne Dolan <
>         > > > ryannedo...@gmail.com
>         > > > > >
>         > > > > > >>> wrote:
>         > > > > > >>>
>         > > > > > >>> > Dan, you've got it right. ACL sync will be done by
> MM2
>         > > > > automatically
>         > > > > > >>> > (unless disabled) according to simple rules:
>         > > > > > >>> >
>         > > > > > >>> > - If a principal has READ access on a topic in a
> source
>         > > cluster,
>         > > > > the
>         > > > > > >>> same
>         > > > > > >>> > principal should have READ access on downstream
> replicated
>         > > topics
>         > > > > > >>> ("remote
>         > > > > > >>> > topics").
>         > > > > > >>> > - Only MM2 has WRITE access on "remote topics".
>         > > > > > >>> >
>         > > > > > >>> > This covers sync from upstream topics like
> "topic1" to
>         > > downstream
>         > > > > > >>> remote
>         > > > > > >>> > topics like "us-west.topic1". What's missing from
> the KIP, as
>         > > you
>         > > > > > point
>         > > > > > >>> > out, is ACL sync between normal topics
> (non-remote). If a
>         > > > consumer
>         > > > > > has
>         > > > > > >>> READ
>         > > > > > >>> > access to topic1 in an upstream cluster, should it
> have READ
>         > > > access
>         > > > > > in
>         > > > > > >>> > topic1 in a downstream cluster?
>         > > > > > >>> >
>         > > > > > >>> > I think the answer generally is no, you don't want
> to give
>         > > > > principals
>         > > > > > >>> > blanket permissions across all DCs automatically.
> For
>         > example,
>         > > > I've
>         > > > > > >>> seen
>         > > > > > >>> > scenarios where certain topics are replicated
> between an
>         > > internal
>         > > > > and
>         > > > > > >>> > external Kafka cluster. You don't want to
> accidentally push
>         > ACL
>         > > > > > changes
>         > > > > > >>> > across this boundary.
>         > > > > > >>> >
>         > > > > > >>> > Moreover, it's clear that MM2 "owns" downstream
> remote topics
>         > > > like
>         > > > > > >>> > "us-west.topic1" -- MM2 is the only producer and
> the only
>         > admin
>         > > > of
>         > > > > > >>> these
>         > > > > > >>> > topics -- so it's natural to have MM2 set the ACL
> for these
>         > > > topics.
>         > > > > > >>> But I
>         > > > > > >>> > think it would be surprising if MM2 tried to
> manipulate
>         > topics
>         > > it
>         > > > > > >>> doesn't
>         > > > > > >>> > own. So I think granting permissions across DCs is
> probably
>         > > > outside
>         > > > > > >>> MM2's
>         > > > > > >>> > purview, but I agree it'd be nice to have tooling
> to help
>         > with
>         > > > > this.
>         > > > > > >>> >
>         > > > > > >>> > Thanks.
>         > > > > > >>> > Ryanne
>         > > > > > >>> >
>         > > > > > >>> > --
>         > > > > > >>> > www.ryannedolan.info
>         > > > > > >>> >
>         > > > > > >>> >
>         > > > > > >>> > On Mon, Nov 19, 2018 at 3:58 PM
> daniel.loci...@gmail.com <
>         > > > > > >>> > daniel.loci...@gmail.com> wrote:
>         > > > > > >>> >
>         > > > > > >>> >> Hi guys,
>         > > > > > >>> >>
>         > > > > > >>> >> This is an exciting topic. could I have a word
> here?
>         > > > > > >>> >> I understand there are many scenarios that we can
> apply
>         > > > > mirrormaker.
>         > > > > > >>> I am
>         > > > > > >>> >> at the moment working on active/active DC
> solution using
>         > > > > > MirrorMaker;
>         > > > > > >>> our
>         > > > > > >>> >> goal is to allow  the clients to failover to
> connect the
>         > other
>         > > > > kafka
>         > > > > > >>> >> cluster (on the other DC) when an incident
> happens.
>         > > > > > >>> >>
>         > > > > > >>> >> To do this, I need:
>         > > > > > >>> >> 1 MirrorMaker to replicate the partitioned
> messages in a
>         > > > > sequential
>         > > > > > >>> order
>         > > > > > >>> >> (in timely fashion) to the same partition on the
> other
>         > cluster
>         > > > > (also
>         > > > > > >>> need
>         > > > > > >>> >> keep the promise that both clusters creates the
> same number
>         > of
>         > > > > > >>> partitions
>         > > > > > >>> >> for a topic) – so that a consumer can pick up the
> right
>         > order
>         > > of
>         > > > > the
>         > > > > > >>> latest
>         > > > > > >>> >> messages
>         > > > > > >>> >> 2 MirrorMaker to replicate the local consumer
> offset to the
>         > > > other
>         > > > > > >>> side –
>         > > > > > >>> >> so that the consumer knows where is the offset/
> latest
>         > > messages
>         > > > > > >>> >> 3 MirrorMaker to provide cycle detection for
> messages across
>         > > the
>         > > > > > DCs.
>         > > > > > >>> >>
>         > > > > > >>> >> I can see the possibility for Remote Topic to
> solve all
>         > these
>         > > > > > >>> problems,
>         > > > > > >>> >> as long as the consumer can see the remote topic
> equally as
>         > > the
>         > > > > > local
>         > > > > > >>> >> topic, i.e. For a consumer which has a permission
> to consume
>         > > > > topic1,
>         > > > > > >>> on
>         > > > > > >>> >> subscribe event it can automatically subscribe
> both
>         > > > remote.topic1
>         > > > > > and
>         > > > > > >>> >> local.topic1. First we need to find a way for
> topic ACL
>         > > granting
>         > > > > for
>         > > > > > >>> the
>         > > > > > >>> >> consumer across the DCs. Secondly the consumer
> need to be
>         > able
>         > > > to
>         > > > > > >>> subscribe
>         > > > > > >>> >> topics with wildcard or suffix. Last but not the
> least, the
>         > > > > consumer
>         > > > > > >>> has to
>         > > > > > >>> >> deal with the timely ordering of the messages
> from the 2
>         > > topics.
>         > > > > > >>> >>
>         > > > > > >>> >> My understanding is, all of these should be
> configurable to
>         > be
>         > > > > > turned
>         > > > > > >>> on
>         > > > > > >>> >> or off, to fit for different use cases.
>         > > > > > >>> >>
>         > > > > > >>> >> Interesting I was going to support topic messages
> with extra
>         > > > > headers
>         > > > > > >>> of
>         > > > > > >>> >> source DC info, for cycle detection…..
>         > > > > > >>> >>
>         > > > > > >>> >> Looking forward your reply.
>         > > > > > >>> >>
>         > > > > > >>> >> Regards,
>         > > > > > >>> >>
>         > > > > > >>> >> Dan
>         > > > > > >>> >> On 2018/10/23 19:56:02, Ryanne Dolan <
> ryannedo...@gmail.com
>         > >
>         > > > > wrote:
>         > > > > > >>> >> > Alex, thanks for the feedback.
>         > > > > > >>> >> >
>         > > > > > >>> >> > > Would it be possible to utilize the
>         > > > > > >>> >> > > Message Headers feature to prevent infinite
> recursion
>         > > > > > >>> >> >
>         > > > > > >>> >> > This isn't necessary due to the topic renaming
> feature
>         > which
>         > > > > > already
>         > > > > > >>> >> > prevents infinite recursion.
>         > > > > > >>> >> >
>         > > > > > >>> >> > If you turn off topic renaming you lose cycle
> detection,
>         > so
>         > > > > maybe
>         > > > > > we
>         > > > > > >>> >> could
>         > > > > > >>> >> > provide message headers as an optional second
> mechanism.
>         > I'm
>         > > > not
>         > > > > > >>> >> opposed to
>         > > > > > >>> >> > that idea, but there are ways to improve
> efficiency if we
>         > > > don't
>         > > > > > >>> need to
>         > > > > > >>> >> > modify or inspect individual records.
>         > > > > > >>> >> >
>         > > > > > >>> >> > Ryanne
>         > > > > > >>> >> >
>         > > > > > >>> >> > On Tue, Oct 23, 2018 at 6:06 AM Alex Mironov <
>         > > > > > alexandr...@gmail.com
>         > > > > > >>> >
>         > > > > > >>> >> wrote:
>         > > > > > >>> >> >
>         > > > > > >>> >> > > Hey Ryanne,
>         > > > > > >>> >> > >
>         > > > > > >>> >> > > Awesome KIP, exited to see improvements in
> MirrorMaker
>         > > > land, I
>         > > > > > >>> >> particularly
>         > > > > > >>> >> > > like the reuse of Connect framework! Would it
> be
>         > possible
>         > > to
>         > > > > > >>> utilize
>         > > > > > >>> >> the
>         > > > > > >>> >> > > Message Headers feature to prevent infinite
> recursion?
>         > For
>         > > > > > >>> example,
>         > > > > > >>> >> MM2
>         > > > > > >>> >> > > could stamp every message with a special
> header payload
>         > > > (e.g.
>         > > > > > >>> >> > > MM2="cluster-name-foo") so in case another
> MM2 instance
>         > > sees
>         > > > > > this
>         > > > > > >>> >> message
>         > > > > > >>> >> > > and it is configured to replicate data into
>         > > > "cluster-name-foo"
>         > > > > > it
>         > > > > > >>> >> would
>         > > > > > >>> >> > > just skip it instead of replicating it back.
>         > > > > > >>> >> > >
>         > > > > > >>> >> > > On Sat, Oct 20, 2018 at 5:48 AM Ryanne Dolan <
>         > > > > > >>> ryannedo...@gmail.com>
>         > > > > > >>> >> > > wrote:
>         > > > > > >>> >> > >
>         > > > > > >>> >> > > > Thanks Harsha. Done.
>         > > > > > >>> >> > > >
>         > > > > > >>> >> > > > On Fri, Oct 19, 2018 at 1:03 AM Harsha
> Chintalapani <
>         > > > > > >>> >> ka...@harsha.io>
>         > > > > > >>> >> > > > wrote:
>         > > > > > >>> >> > > >
>         > > > > > >>> >> > > > > Ryanne,
>         > > > > > >>> >> > > > >        Makes sense. Can you please add
> this under
>         > > > rejected
>         > > > > > >>> >> alternatives
>         > > > > > >>> >> > > > so
>         > > > > > >>> >> > > > > that everyone has context on why it
> wasn’t picked.
>         > > > > > >>> >> > > > >
>         > > > > > >>> >> > > > > Thanks,
>         > > > > > >>> >> > > > > Harsha
>         > > > > > >>> >> > > > > On Oct 18, 2018, 8:02 AM -0700, Ryanne
> Dolan <
>         > > > > > >>> >> ryannedo...@gmail.com>,
>         > > > > > >>> >> > > > > wrote:
>         > > > > > >>> >> > > > >
>         > > > > > >>> >> > > > > Harsha, concerning uReplicator
> specifically, the
>         > > project
>         > > > > is
>         > > > > > a
>         > > > > > >>> >> major
>         > > > > > >>> >> > > > > inspiration for MM2, but I don't think it
> is a good
>         > > > > > >>> foundation for
>         > > > > > >>> >> > > > anything
>         > > > > > >>> >> > > > > included in Apache Kafka. uReplicator
> uses Helix to
>         > > > solve
>         > > > > > >>> >> problems that
>         > > > > > >>> >> > > > > Connect also solves, e.g. REST API, live
>         > configuration
>         > > > > > >>> changes,
>         > > > > > >>> >> cluster
>         > > > > > >>> >> > > > > management, coordination etc. This also
> means that
>         > > > > existing
>         > > > > > >>> >> tooling,
>         > > > > > >>> >> > > > > dashboards etc that work with Connectors
> do not work
>         > > > with
>         > > > > > >>> >> uReplicator,
>         > > > > > >>> >> > > > and
>         > > > > > >>> >> > > > > any future tooling would need to treat
> uReplicator
>         > as
>         > > a
>         > > > > > >>> special
>         > > > > > >>> >> case.
>         > > > > > >>> >> > > > >
>         > > > > > >>> >> > > > > Ryanne
>         > > > > > >>> >> > > > >
>         > > > > > >>> >> > > > > On Wed, Oct 17, 2018 at 12:30 PM Ryanne
> Dolan <
>         > > > > > >>> >> ryannedo...@gmail.com>
>         > > > > > >>> >> > > > > wrote:
>         > > > > > >>> >> > > > >
>         > > > > > >>> >> > > > >> Harsha, yes I can do that. I'll update
> the KIP
>         > > > > accordingly,
>         > > > > > >>> >> thanks.
>         > > > > > >>> >> > > > >>
>         > > > > > >>> >> > > > >> Ryanne
>         > > > > > >>> >> > > > >>
>         > > > > > >>> >> > > > >> On Wed, Oct 17, 2018 at 12:18 PM Harsha <
>         > > > ka...@harsha.io
>         > > > > >
>         > > > > > >>> wrote:
>         > > > > > >>> >> > > > >>
>         > > > > > >>> >> > > > >>> Hi Ryanne,
>         > > > > > >>> >> > > > >>>                Thanks for the KIP. I am
> also
>         > curious
>         > > > > about
>         > > > > > >>> why
>         > > > > > >>> >> not
>         > > > > > >>> >> > > use
>         > > > > > >>> >> > > > >>> the uReplicator design as the
> foundation given it
>         > > > > alreadys
>         > > > > > >>> >> resolves
>         > > > > > >>> >> > > > some of
>         > > > > > >>> >> > > > >>> the fundamental issues in current
> MIrrorMaker,
>         > > > updating
>         > > > > > the
>         > > > > > >>> >> confifgs
>         > > > > > >>> >> > > > on the
>         > > > > > >>> >> > > > >>> fly and running the mirror maker agents
> in a
>         > worker
>         > > > > model
>         > > > > > >>> which
>         > > > > > >>> >> can
>         > > > > > >>> >> > > > >>> deployed in mesos or container
> orchestrations.  If
>         > > > > > possible
>         > > > > > >>> can
>         > > > > > >>> >> you
>         > > > > > >>> >> > > > >>> document in the rejected alternatives
> what are
>         > > missing
>         > > > > > parts
>         > > > > > >>> >> that
>         > > > > > >>> >> > > made
>         > > > > > >>> >> > > > you
>         > > > > > >>> >> > > > >>> to consider a new design from ground up.
>         > > > > > >>> >> > > > >>>
>         > > > > > >>> >> > > > >>> Thanks,
>         > > > > > >>> >> > > > >>> Harsha
>         > > > > > >>> >> > > > >>>
>         > > > > > >>> >> > > > >>> On Wed, Oct 17, 2018, at 8:34 AM,
> Ryanne Dolan
>         > > wrote:
>         > > > > > >>> >> > > > >>> > Jan, these are two separate issues.
>         > > > > > >>> >> > > > >>> >
>         > > > > > >>> >> > > > >>> > 1) consumer coordination should not,
> ideally,
>         > > > involve
>         > > > > > >>> >> unreliable or
>         > > > > > >>> >> > > > >>> slow
>         > > > > > >>> >> > > > >>> > connections. Naively, a
> KafkaSourceConnector
>         > would
>         > > > > > >>> coordinate
>         > > > > > >>> >> via
>         > > > > > >>> >> > > the
>         > > > > > >>> >> > > > >>> > source cluster. We can do better than
> this, but
>         > > I'm
>         > > > > > >>> deferring
>         > > > > > >>> >> this
>         > > > > > >>> >> > > > >>> > optimization for now.
>         > > > > > >>> >> > > > >>> >
>         > > > > > >>> >> > > > >>> > 2) exactly-once between two clusters
> is
>         > > > mind-bending.
>         > > > > > But
>         > > > > > >>> >> keep in
>         > > > > > >>> >> > > > mind
>         > > > > > >>> >> > > > >>> that
>         > > > > > >>> >> > > > >>> > transactions are managed by the
> producer, not
>         > the
>         > > > > > >>> consumer. In
>         > > > > > >>> >> > > fact,
>         > > > > > >>> >> > > > >>> it's
>         > > > > > >>> >> > > > >>> > the producer that requests that
> offsets be
>         > > committed
>         > > > > for
>         > > > > > >>> the
>         > > > > > >>> >> > > current
>         > > > > > >>> >> > > > >>> > transaction. Obviously, these offsets
> are
>         > > committed
>         > > > in
>         > > > > > >>> >> whatever
>         > > > > > >>> >> > > > >>> cluster the
>         > > > > > >>> >> > > > >>> > producer is sending to.
>         > > > > > >>> >> > > > >>> >
>         > > > > > >>> >> > > > >>> > These two issues are closely related.
> They are
>         > > both
>         > > > > > >>> resolved
>         > > > > > >>> >> by not
>         > > > > > >>> >> > > > >>> > coordinating or committing via the
> source
>         > cluster.
>         > > > And
>         > > > > > in
>         > > > > > >>> >> fact,
>         > > > > > >>> >> > > this
>         > > > > > >>> >> > > > >>> is the
>         > > > > > >>> >> > > > >>> > general model of SourceConnectors
> anyway, since
>         > > most
>         > > > > > >>> >> > > SourceConnectors
>         > > > > > >>> >> > > > >>> > _only_ have a destination cluster.
>         > > > > > >>> >> > > > >>> >
>         > > > > > >>> >> > > > >>> > If there is a lot of interest here, I
> can
>         > expound
>         > > > > > further
>         > > > > > >>> on
>         > > > > > >>> >> this
>         > > > > > >>> >> > > > >>> aspect of
>         > > > > > >>> >> > > > >>> > MM2, but again I think this is
> premature until
>         > > this
>         > > > > > first
>         > > > > > >>> KIP
>         > > > > > >>> >> is
>         > > > > > >>> >> > > > >>> approved.
>         > > > > > >>> >> > > > >>> > I intend to address each of these in
> separate
>         > KIPs
>         > > > > > >>> following
>         > > > > > >>> >> this
>         > > > > > >>> >> > > > one.
>         > > > > > >>> >> > > > >>> >
>         > > > > > >>> >> > > > >>> > Ryanne
>         > > > > > >>> >> > > > >>> >
>         > > > > > >>> >> > > > >>> > On Wed, Oct 17, 2018 at 7:09 AM Jan
> Filipiak <
>         > > > > > >>> >> > > > jan.filip...@trivago.com
>         > > > > > >>> >> > > > >>> >
>         > > > > > >>> >> > > > >>> > wrote:
>         > > > > > >>> >> > > > >>> >
>         > > > > > >>> >> > > > >>> > > This is not a performance
> optimisation. Its a
>         > > > > > >>> fundamental
>         > > > > > >>> >> design
>         > > > > > >>> >> > > > >>> choice.
>         > > > > > >>> >> > > > >>> > >
>         > > > > > >>> >> > > > >>> > >
>         > > > > > >>> >> > > > >>> > > I never really took a look how
> streams does
>         > > > exactly
>         > > > > > >>> once.
>         > > > > > >>> >> (its a
>         > > > > > >>> >> > > > trap
>         > > > > > >>> >> > > > >>> > > anyways and you usually can deal
> with at least
>         > > > once
>         > > > > > >>> >> donwstream
>         > > > > > >>> >> > > > pretty
>         > > > > > >>> >> > > > >>> > > easy). But I am very certain its
> not gonna get
>         > > > > > >>> somewhere if
>         > > > > > >>> >> > > offset
>         > > > > > >>> >> > > > >>> > > commit and record produce cluster
> are not the
>         > > > same.
>         > > > > > >>> >> > > > >>> > >
>         > > > > > >>> >> > > > >>> > > Pretty sure without this _design
> choice_ you
>         > can
>         > > > > skip
>         > > > > > on
>         > > > > > >>> >> that
>         > > > > > >>> >> > > > exactly
>         > > > > > >>> >> > > > >>> > > once already
>         > > > > > >>> >> > > > >>> > >
>         > > > > > >>> >> > > > >>> > > Best Jan
>         > > > > > >>> >> > > > >>> > >
>         > > > > > >>> >> > > > >>> > > On 16.10.2018 18:16, Ryanne Dolan
> wrote:
>         > > > > > >>> >> > > > >>> > > >  >  But one big obstacle in this
> was
>         > > > > > >>> >> > > > >>> > > > always that group coordination
> happened on
>         > the
>         > > > > > source
>         > > > > > >>> >> cluster.
>         > > > > > >>> >> > > > >>> > > >
>         > > > > > >>> >> > > > >>> > > > Jan, thank you for bringing up
> this issue
>         > with
>         > > > > > legacy
>         > > > > > >>> >> > > > MirrorMaker.
>         > > > > > >>> >> > > > >>> I
>         > > > > > >>> >> > > > >>> > > > totally agree with you. This is
> one of
>         > several
>         > > > > > >>> problems
>         > > > > > >>> >> with
>         > > > > > >>> >> > > > >>> MirrorMaker
>         > > > > > >>> >> > > > >>> > > > I intend to solve in MM2, and I
> already
>         > have a
>         > > > > > design
>         > > > > > >>> and
>         > > > > > >>> >> > > > >>> prototype that
>         > > > > > >>> >> > > > >>> > > > solves this and related issues.
> But as you
>         > > > pointed
>         > > > > > >>> out,
>         > > > > > >>> >> this
>         > > > > > >>> >> > > KIP
>         > > > > > >>> >> > > > is
>         > > > > > >>> >> > > > >>> > > > already rather complex, and I
> want to focus
>         > on
>         > > > the
>         > > > > > >>> core
>         > > > > > >>> >> feature
>         > > > > > >>> >> > > > set
>         > > > > > >>> >> > > > >>> > > > rather than performance
> optimizations for
>         > now.
>         > > > If
>         > > > > we
>         > > > > > >>> can
>         > > > > > >>> >> agree
>         > > > > > >>> >> > > on
>         > > > > > >>> >> > > > >>> what
>         > > > > > >>> >> > > > >>> > > > MM2 looks like, it will be very
> easy to
>         > agree
>         > > to
>         > > > > > >>> improve
>         > > > > > >>> >> its
>         > > > > > >>> >> > > > >>> performance
>         > > > > > >>> >> > > > >>> > > > and reliability.
>         > > > > > >>> >> > > > >>> > > >
>         > > > > > >>> >> > > > >>> > > > That said, I look forward to your
> support
>         > on a
>         > > > > > >>> subsequent
>         > > > > > >>> >> KIP
>         > > > > > >>> >> > > > that
>         > > > > > >>> >> > > > >>> > > > addresses consumer coordination
> and
>         > rebalance
>         > > > > > issues.
>         > > > > > >>> Stay
>         > > > > > >>> >> > > tuned!
>         > > > > > >>> >> > > > >>> > > >
>         > > > > > >>> >> > > > >>> > > > Ryanne
>         > > > > > >>> >> > > > >>> > > >
>         > > > > > >>> >> > > > >>> > > > On Tue, Oct 16, 2018 at 6:58 AM
> Jan
>         > Filipiak <
>         > > > > > >>> >> > > > >>> jan.filip...@trivago.com
>         > > > > > >>> >> > > > >>> > > > <mailto:jan.filip...@trivago.com>>
> wrote:
>         > > > > > >>> >> > > > >>> > > >
>         > > > > > >>> >> > > > >>> > > >     Hi,
>         > > > > > >>> >> > > > >>> > > >
>         > > > > > >>> >> > > > >>> > > >     Currently MirrorMaker is
> usually run
>         > > > > collocated
>         > > > > > >>> with
>         > > > > > >>> >> the
>         > > > > > >>> >> > > > target
>         > > > > > >>> >> > > > >>> > > >     cluster.
>         > > > > > >>> >> > > > >>> > > >     This is all nice and good.
> But one big
>         > > > > obstacle
>         > > > > > in
>         > > > > > >>> >> this was
>         > > > > > >>> >> > > > >>> > > >     always that group
> coordination happened
>         > on
>         > > > the
>         > > > > > >>> source
>         > > > > > >>> >> > > > cluster.
>         > > > > > >>> >> > > > >>> So
>         > > > > > >>> >> > > > >>> > > when
>         > > > > > >>> >> > > > >>> > > >     then network was congested,
> you
>         > sometimes
>         > > > > loose
>         > > > > > >>> group
>         > > > > > >>> >> > > > >>> membership and
>         > > > > > >>> >> > > > >>> > > >     have to rebalance and all
> this.
>         > > > > > >>> >> > > > >>> > > >
>         > > > > > >>> >> > > > >>> > > >     So one big request from we
> would be the
>         > > > > support
>         > > > > > of
>         > > > > > >>> >> having
>         > > > > > >>> >> > > > >>> > > coordination
>         > > > > > >>> >> > > > >>> > > >     cluster != source cluster.
>         > > > > > >>> >> > > > >>> > > >
>         > > > > > >>> >> > > > >>> > > >     I would generally say a LAN
> is better
>         > > than a
>         > > > > WAN
>         > > > > > >>> for
>         > > > > > >>> >> doing
>         > > > > > >>> >> > > > >>> group
>         > > > > > >>> >> > > > >>> > > >     coordinaton and there is no
> reason we
>         > > > couldn't
>         > > > > > >>> have a
>         > > > > > >>> >> group
>         > > > > > >>> >> > > > >>> consuming
>         > > > > > >>> >> > > > >>> > > >     topics from a different
> cluster and
>         > > > committing
>         > > > > > >>> >> offsets to
>         > > > > > >>> >> > > > >>> another
>         > > > > > >>> >> > > > >>> > > >     one right?
>         > > > > > >>> >> > > > >>> > > >
>         > > > > > >>> >> > > > >>> > > >     Other than that. It feels
> like the KIP
>         > has
>         > > > too
>         > > > > > >>> much
>         > > > > > >>> >> > > features
>         > > > > > >>> >> > > > >>> where
>         > > > > > >>> >> > > > >>> > > many
>         > > > > > >>> >> > > > >>> > > >     of them are not really wanted
> and
>         > counter
>         > > > > > >>> productive
>         > > > > > >>> >> but I
>         > > > > > >>> >> > > > >>> will just
>         > > > > > >>> >> > > > >>> > > >     wait and see how the
> discussion goes.
>         > > > > > >>> >> > > > >>> > > >
>         > > > > > >>> >> > > > >>> > > >     Best Jan
>         > > > > > >>> >> > > > >>> > > >
>         > > > > > >>> >> > > > >>> > > >
>         > > > > > >>> >> > > > >>> > > >     On 15.10.2018 18:16, Ryanne
> Dolan wrote:
>         > > > > > >>> >> > > > >>> > > >      > Hey y'all!
>         > > > > > >>> >> > > > >>> > > >      >
>         > > > > > >>> >> > > > >>> > > >      > Please take a look at
> KIP-382:
>         > > > > > >>> >> > > > >>> > > >      >
>         > > > > > >>> >> > > > >>> > > >      >
>         > > > > > >>> >> > > > >>> > > >
>         > > > > > >>> >> > > > >>> > >
>         > > > > > >>> >> > > > >>>
>         > > > > > >>> >> > > >
>         > > > > > >>> >> > >
>         > > > > > >>> >>
>         > > > > > >>>
>         > > > > >
>         > > > >
>         > > >
>         > >
>         >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0
>         > > > > > >>> >> > > > >>> > > >      >
>         > > > > > >>> >> > > > >>> > > >      > Thanks for your feedback
> and support.
>         > > > > > >>> >> > > > >>> > > >      >
>         > > > > > >>> >> > > > >>> > > >      > Ryanne
>         > > > > > >>> >> > > > >>> > > >      >
>         > > > > > >>> >> > > > >>> > > >
>         > > > > > >>> >> > > > >>> > >
>         > > > > > >>> >> > > > >>>
>         > > > > > >>> >> > > > >>
>         > > > > > >>> >> > > >
>         > > > > > >>> >> > >
>         > > > > > >>> >> > >
>         > > > > > >>> >> > > --
>         > > > > > >>> >> > > Best,
>         > > > > > >>> >> > > Alex Mironov
>         > > > > > >>> >> > >
>         > > > > > >>> >> >
>         > > > > > >>> >>
>         > > > > > >>> >
>         > > > > > >>>
>         > > > > > >>
>         > > > > > >>
>         > > > > > >> --
>         > > > > > >> Sönke Liebau
>         > > > > > >> Partner
>         > > > > > >> Tel. +49 179 7940878
>         > > > > > >> OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880
> Wedel -
>         > > > Germany
>         > > > > > >>
>         > > > > > >
>         > > > > >
>         > > > > > --
>         > > > > > Sönke Liebau
>         > > > > > Partner
>         > > > > > Tel. +49 179 7940878
>         > > > > > OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880
> Wedel -
>         > Germany
>         > > > > >
>         > > > >
>         > > >
>         > > >
>         > > > --
>         > > > Sönke Liebau
>         > > > Partner
>         > > > Tel. +49 179 7940878
>         > > > OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel
> - Germany
>         > > >
>         > >
>         >
>         >
>         > --
>         > Sönke Liebau
>         > Partner
>         > Tel. +49 179 7940878
>         > OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel -
> Germany
>         >
>
>
>
>
>
