Wikimedia currently implements 'master <-> master' replication by manually prefixing topic names with datacenter names, and then configuring MirrorMaker to replicate only the topics that begin with one DC's name over to the other DC.
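Concretely, that kind of setup looks roughly like the following with today's MirrorMaker (the DC names, file names and topic prefix here are purely illustrative):

    # Run in DC "dc2": pull only topics whose names start with "dc1." out of the
    # dc1 cluster and mirror them into the local dc2 cluster. Producers in dc1
    # always write to topics prefixed with their own DC name, e.g. dc1.webrequest.
    ./bin/kafka-mirror-maker.sh \
      --consumer.config consumer-dc1.properties \
      --producer.config producer-dc2.properties \
      --whitelist 'dc1\..*'

The prefix is what keeps the mirrored topics from being picked up and replicated back again, which is essentially the cycle-detection role the KIP assigns to remote topic renaming.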
While having topics named with topological details is manageable, I wouldn't say it is desirable. It pushes knowledge of the replication topology up to clients. Even if MirrorMaker was the one doing the topic prefixing, downstream consumers of a group of replicated topics are still going to have to know to subscribe to the correctly prefixed topics. If possible I'd much prefer header + hops based replication rather than lots of renamed topics. But either way, this KIP would be tremendously useful to us so I support it all the way! :) On Tue, Dec 11, 2018 at 5:32 AM Michael Pearce <michael.pea...@ig.com> wrote: > So this is indeed what using headers with hops avoids is creating lots and > lots of topics __, so you can have more complex topology setups. > > I ask why not support having two ways of setting up and closing the door? > > One based on hops using headers, and another based on topic naming. After > all flexibility is what we want its for end users how to use right? > > > > On 12/7/18, 8:19 PM, "Ryanne Dolan" <ryannedo...@gmail.com> wrote: > > Michael, thanks for the comments! > > > would like to see support for this to be done by hops, as well [...] > This then allows ring (hops = number of brokers in the ring), mesh > (every > cluster interconnected so hop=1), or even a tree (more fine grained > setup) > cluster topology. > > That's a good idea, though we can do this at the topic level without > tagging individual records. A max.hop of 1 would mean "A.topic1" is > allowed, but not "B.A.topic1". I think the default behavior would need > to > be max.hops = 1 to avoid unexpectedly creating a bunch of D.C.B.A... > topics > when you create a fully-connected mesh topology. > > Looking ahead a bit, I can imagine an external tool computing the > spanning > tree of topics among a set of clusters based on inter-cluster > replication > lag, and setting up MM2 accordingly. But that's probably outside the > scope > of this KIP :) > > > ...standalone MirrorMaker connector... > > ./bin/kafka-mirror-maker-2.sh --consumer consumer.properties > --producer producer.properties > > Eventually, I'd like MM2 to completely replace legacy MM, including the > ./bin/kafka-mirror-maker.sh script. In the meantime, it's a good idea > to > include a standalone driver. Something like > ./bin/connect-mirror-maker-standalone.sh with the same high-level > configuration file. I'll do that, thanks. > > > I see no section on providing support for mirror maker Handlers, > today > people can add handlers to have a little extra custom logic if needed, > and > the handler api is public today so should be supported going forwards > so > people are not on mass re-writing these. > > Great point. Connect offers single-message transformations and > converters > for this purpose, but I agree that we should honor the existing API if > possible. This might be as easy as providing an adapter class between > connect's Transformation and mirror-maker's Handler. Maybe file a Jira > ticket to track this? > > Really appreciate your feedback! > > Ryanne > > > On Thu, Dec 6, 2018 at 7:03 PM Michael Pearce <michael.pea...@ig.com> > wrote: > > > Re hops to stop the cycle and to allow a range of multi cluster > > topologies, see https://www.rabbitmq.com/federated-exchanges.html > where > > very similar was done in rabbit. > > > > > > > > On 12/7/18, 12:47 AM, "Michael Pearce" <michael.pea...@ig.com> > wrote: > > > > Nice proposal. > > > > Some comments. > > > > > > On the section around cycle detection. 
> > > > I would like to see support for this to be done by hops, as well > e.g. > > using approach is to use a header for the number of hops, as the mm2 > > replicates it increases the hop count and you can make the mm2 > configurable > > to only produce messages onwards where hops are less than x. > > This then allows ring (hops = number of brokers in the ring), > mesh > > (every cluster interconnected so hop=1), or even a tree (more fine > grained > > setup) cluster topology. > > FYI we do this currently with the current mirror maker, using a > custom > > handler. > > > > > > On the section around running a standalone MirrorMaker connector > > > > I would suggest making this as easy to run as the mirrormakers > are > > today, with a simple single sh script. > > I assume this is what is proposed in section "Running > MirrorMaker in > > legacy mode" but I would even do this before MM would be removed, > with a -2 > > varient. > > e.g. > > ./bin/kafka-mirror-maker-2.sh --consumer consumer.properties > > --producer producer.properties > > > > Lastly > > > > I see no section on providing support for mirror maker Handlers, > today > > people can add handlers to have a little extra custom logic if > needed, and > > the handler api is public today so should be supported going > forwards so > > people are not on mass re-writing these. > > > > On 12/5/18, 5:36 PM, "Ryanne Dolan" <ryannedo...@gmail.com> > wrote: > > > > Sönke, > > > > > The only thing that I could come up with is the limitation > to a > > single > > offset commit interval > > > > Yes, and other internal properties, e.g. those used by the > internal > > consumers and producers, which, granted, probably are not > often > > changed > > from their defaults, but that apply to Connectors across the > > entire cluster. > > > > Ryanne > > > > On Wed, Dec 5, 2018 at 3:21 AM Sönke Liebau > > <soenke.lie...@opencore.com.invalid> wrote: > > > > > Hi Ryanne, > > > > > > when you say "Currently worker configs apply across the > entire > > cluster, > > > which is limiting even for use-cases involving a single > Kafka > > cluster.", > > > may I ask you to elaborate on those limitations a little? > > > The only thing that I could come up with is the limitation > to a > > single > > > offset commit interval value for all running connectors. > > > Maybe also the limitation to shared config providers.. > > > > > > But you sound like you had painful experiences with this > before, > > maybe > > > you'd like to share the burden :) > > > > > > Best regards, > > > Sönke > > > > > > On Wed, Dec 5, 2018 at 5:15 AM Ryanne Dolan < > > ryannedo...@gmail.com> wrote: > > > > > > > Sönke, > > > > > > > > I think so long as we can keep the differences at a very > high > > level (i.e. > > > > the "control plane"), there is little downside to MM2 and > > Connect > > > > coexisting. I do expect them to converge to some extent, > with > > features > > > from > > > > MM2 being pulled into Connect whenever this is possible > > without breaking > > > > things. > > > > > > > > I could definitely see your idea re hierarchies or > groups of > > connectors > > > > being useful outside MM2. Currently "worker configs" > apply > > across the > > > > entire cluster, which is limiting even for use-cases > involving > > a single > > > > Kafka cluster. If Connect supported multiple workers in > the > > same cluster, > > > > it would start to look a lot like a MM2 cluster. 
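For what it's worth, the header-based hop counting described above is simple enough to sketch. Something along these lines is roughly what a custom handler or transformation would do today; the header key, the maxHops setting and the class itself are invented purely for illustration:

    import java.nio.charset.StandardCharsets;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.header.Header;

    /**
     * Illustrative hop-count filter: drop records that have already crossed
     * maxHops clusters, otherwise forward them with the count incremented.
     */
    public class HopCountFilter {
        private static final String HOP_HEADER = "replication.hops"; // invented header key
        private final int maxHops;

        public HopCountFilter(int maxHops) {
            this.maxHops = maxHops;
        }

        /** Returns the record to produce onwards, or null if it should be dropped. */
        public ProducerRecord<byte[], byte[]> forward(ConsumerRecord<byte[], byte[]> in,
                                                      String targetTopic) {
            Header header = in.headers().lastHeader(HOP_HEADER);
            int hops = header == null
                ? 0
                : Integer.parseInt(new String(header.value(), StandardCharsets.UTF_8));
            if (hops >= maxHops) {
                return null; // travelled far enough already, so do not replicate onwards
            }
            ProducerRecord<byte[], byte[]> out =
                new ProducerRecord<>(targetTopic, null, in.key(), in.value(), in.headers());
            out.headers().remove(HOP_HEADER);
            out.headers().add(HOP_HEADER,
                Integer.toString(hops + 1).getBytes(StandardCharsets.UTF_8));
            return out;
        }
    }

A mesh then runs with maxHops = 1, a ring with maxHops equal to the number of clusters in the ring, and so on, as Michael describes.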
> > > > > > > > Ryanne > > > > > > > > On Tue, Dec 4, 2018 at 3:26 PM Sönke Liebau > > > > <soenke.lie...@opencore.com.invalid> wrote: > > > > > > > > > Hi Ryanne, > > > > > > > > > > thanks for your response! > > > > > > > > > > It seems like you have already done a lot of > investigation > > into the > > > > > existing code and the solution design and all of what > you > > write makes > > > > sense > > > > > to me. Would it potentially be worth adding this to > the KIP, > > now that > > > you > > > > > had to write it up because of me anyway? > > > > > > > > > > However, I am afraid that I am still not entirely > convinced > > of the > > > > > fundamental benefit this provides over an extended > Connect > > that has the > > > > > following functionality: > > > > > - allow for organizing connectors into a hierarchical > > structure - > > > > > "clusters/us-west/..." > > > > > - allow defining external Kafka clusters to be used by > > Source and Sink > > > > > connectors instead of the local cluster > > > > > > > > > > Personally I think both of these features are useful > > additions to > > > > Connect, > > > > > I'll address both separately below. > > > > > > > > > > Allowing to structure connectors in a hierarchy > > > > > Organizing running connectors will grow more important > as > > corporate > > > > > customers adapt Connect and installations grow in size. > > Additionally > > > this > > > > > could be useful for ACLs in case they are ever added to > > Connect, as you > > > > > could allow specific users access only to specific > > namespaces (and > > > until > > > > > ACLs are added it would facilitate using a reverse > proxy for > > the same > > > > > effect). > > > > > > > > > > Allow accessing multiple external clusters > > > > > The reasoning for this feature is pretty much the same > as > > for a central > > > > > Mirror Maker cluster, if a company has multiple > clusters for > > whatever > > > > > reason but wants to have ingest centralized in one > system > > aka one > > > Connect > > > > > cluster they would need the ability to read from and > write > > to an > > > > arbitrary > > > > > number of Kafka clusters. > > > > > I haven't really looked at the code, just poked around > a > > couple of > > > > minutes, > > > > > but it appears like this could be done with fairly low > > effort. My > > > general > > > > > idea would be to leave the existing configuration > options > > untouched - > > > > > Connect will always need a "primary" cluster that is > used > > for storage > > > of > > > > > internal data (config, offsets, status) there is no > need to > > break > > > > existing > > > > > configs. But additionally allow adding named extra > clusters > > by > > > specifying > > > > > options like > > > > > external.sales_cluster.bootstrap_servers=... > > > > > external.sales_cluster.ssl.keystore.location=... > > > > > external.marketing_cluster.bootstrap_servers=... > > > > > > > > > > The code for status, offset and config storage is > mostly > > isolated in > > > the > > > > > Kafka[Offset|Status|Config]BackingStore classes and > could > > remain pretty > > > > > much unchanged. > > > > > > > > > > Producer and consumer creation for Tasks is done in the > > Worker as of > > > > > KAFKA-7551 and is isolated in two functions. We could > add a > > two more > > > > > functions with an extra argument for the external > cluster > > name to be > > > used > > > > > and return fitting consumers/producers. 
> > > > > The source and sink config would then simply gain an > > optional setting > > > to > > > > > specify the cluster name. > > > > > > > > > > I am very sure that I am missing a few large issues > with > > these ideas, > > > I'm > > > > > mostly back-of-the-napkin designing here, but it might > be > > worth a > > > second > > > > > look. > > > > > > > > > > Once we decide to diverge into two clusters: > MirrorMaker and > > Connect, I > > > > > think realistically the chance of those two ever being > > merged again > > > > because > > > > > they grow back together is practically zero - hence my > > hesitation. > > > > > > > > > > ---- > > > > > > > > > > All of that being said, I am absolutely happy to agree > to > > disagree, I > > > > think > > > > > to a certain extent this is down to a question of > personal > > > > > style/preference. And as this is your baby and you > have put > > a lot more > > > > > effort and thought into it than I ever will I'll shut > up now > > :) > > > > > > > > > > Again, thanks for all your good work! > > > > > > > > > > Best regards, > > > > > Sönke > > > > > > > > > > On Fri, Nov 30, 2018 at 9:00 PM Ryanne Dolan < > > ryannedo...@gmail.com> > > > > > wrote: > > > > > > > > > > > Thanks Sönke. > > > > > > > > > > > > > it just feels to me like an awful lot of Connect > > functionality > > > would > > > > > need > > > > > > to be reimplemented or at least wrapped > > > > > > > > > > > > Connect currently has two drivers, > ConnectDistributed and > > > > > > ConnectStandalone. Both set up a Herder, which > manages > > Workers. I've > > > > > > implemented a third driver which sets up multiple > Herders, > > one for > > > each > > > > > > Kafka cluster as specified in a config file. From the > > Herder level > > > > down, > > > > > > nothing is changed or duplicated -- it's just > Connect. > > > > > > > > > > > > For the REST API, Connect wraps a Herder in a > RestServer > > class, which > > > > > > creates a Jetty server with a few JAX-RS resources. > One of > > these > > > > > resources > > > > > > is ConnectorsResource, which is the real meat of the > REST > > API, > > > enabling > > > > > > start, stop, creation, deletion, and configuration of > > Connectors. > > > > > > > > > > > > I've added MirrorRestServer, which wraps a set of > Herders > > instead of > > > > one. > > > > > > The server exposes a single resource, > ClustersResource, > > which is > > > only a > > > > > few > > > > > > lines of code: > > > > > > > > > > > > @GET > > > > > > @Path("/") > > > > > > public Collection<String> listClusters() { > > > > > > return clusters.keySet(); > > > > > > } > > > > > > > > > > > > @Path("/{cluster}") > > > > > > public ConnectorsResource > > > getConnectorsForCluster(@PathParam("cluster") > > > > > > cluster) { > > > > > > return new > ConnectorsResource(clusters.get(cluster)); > > > > > > } > > > > > > > > > > > > (simplified a bit and subject to change) > > > > > > > > > > > > The ClustersResource defers to the existing > > ConnectorsResource, which > > > > > again > > > > > > is most of the Connect API. With this in place, I > can make > > requests > > > > like: > > > > > > > > > > > > GET /clusters > > > > > > > > > > > > GET /clusters/us-west/connectors > > > > > > > > > > > > PUT /clusters/us-west/connectors/us-east/config > > > > > > { "topics" : "topic1" } > > > > > > > > > > > > etc. > > > > > > > > > > > > So on the whole, very little code is involved in > > implementing > > > > > "MirrorMaker > > > > > > clusters". 
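As a usage sketch, remote-controlling such a cluster from the command line would presumably look something like this (host and port are made up; the endpoints are the ones listed above):

    # list the Kafka clusters this MirrorMaker cluster knows about
    curl http://mm2-host:8083/clusters

    # reconfigure the us-east -> us-west flow to replicate topic1
    curl -X PUT -H "Content-Type: application/json" \
         -d '{ "topics": "topic1" }' \
         http://mm2-host:8083/clusters/us-west/connectors/us-east/config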
I won't rule out adding additional > features on > > top of this > > > > > basic > > > > > > API, but nothing should require re-implementing what > is > > already in > > > > > Connect. > > > > > > > > > > > > > Wouldn't it be a viable alternative to look into > > extending Connect > > > > > itself > > > > > > > > > > > > Maybe Connect will evolve to the point where Connect > > clusters and > > > > > > MirrorMaker clusters are indistinguishable, but I > think > > this is > > > > unlikely, > > > > > > since really no use-case outside replication would > benefit > > from the > > > > added > > > > > > complexity. Moreover, I think support for multiple > Kafka > > clusters > > > would > > > > > be > > > > > > hard to add without significant changes to the > existing > > APIs and > > > > configs, > > > > > > which all assume a single Kafka cluster. I think > > Connect-as-a-Service > > > > and > > > > > > Replication-as-a-Service are sufficiently different > > use-cases that we > > > > > > should expect the APIs and configuration files to be > at > > least > > > slightly > > > > > > different, even if both use the same framework > underneath. > > That > > > said, I > > > > > do > > > > > > plan to contribute a few improvements to the Connect > > framework in > > > > support > > > > > > of MM2 -- just nothing within the scope of the > current KIP. > > > > > > > > > > > > Thanks again! > > > > > > Ryanne > > > > > > > > > > > > > > > > > > On Fri, Nov 30, 2018 at 3:47 AM Sönke Liebau > > > > > > <soenke.lie...@opencore.com.invalid> wrote: > > > > > > > > > > > > > Hi Ryanne, > > > > > > > > > > > > > > thanks. I missed the remote to remote replication > > scenario in my > > > > train > > > > > of > > > > > > > thought, you are right. > > > > > > > > > > > > > > That being said I have to admit that I am not yet > fully > > on board > > > with > > > > > the > > > > > > > concept, sorry. But I might just be > misunderstanding > > what your > > > > > intention > > > > > > > is. Let me try and explain what I think it is you > are > > trying to do > > > > and > > > > > > why > > > > > > > I am on the fence about that and take it from > there. > > > > > > > > > > > > > > You want to create an extra mirrormaker driver > class > > which will > > > take > > > > > > > multiple clusters as configuration options. Based > on > > these clusters > > > > it > > > > > > will > > > > > > > then reuse the connect workers and create as many > as > > necessary to > > > be > > > > > able > > > > > > > to replicate to/from each of those configured > clusters. > > It will > > > then > > > > > > > expose a rest api (since you stated subset of > Connect > > rest api I > > > > assume > > > > > > it > > > > > > > will be a new / own one?) that allows users to > send > > requests like > > > > > > > "replicate topic a from cluster 1 to cluster 1" and > > start a > > > connector > > > > > on > > > > > > > the relevant worker that can offer this "route". > > > > > > > This can be extended to a cluster by starting > mirror > > maker drivers > > > on > > > > > > other > > > > > > > nodes with the same config and it would offer all > the > > connect > > > > features > > > > > of > > > > > > > balancing restarting in case of failure etc. 
> > > > > > > > > > > > > > If this understanding is correct then it just > feels to > > me like an > > > > awful > > > > > > lot > > > > > > > of Connect functionality would need to be > reimplemented > > or at least > > > > > > > wrapped, which potentially could mean additional > effort > > for > > > > maintaining > > > > > > and > > > > > > > extending Connect down the line. Wouldn't it be a > viable > > > alternative > > > > to > > > > > > > look into extending Connect itself to allow > defining > > "remote > > > > clusters" > > > > > > > which can then be specified in the connector > config to > > be used > > > > instead > > > > > of > > > > > > > the local cluster? I imagine that change itself > would > > not be too > > > > > > extensive, > > > > > > > the main effort would probably be in coming up > with a > > sensible > > > config > > > > > > > structure and ensuring backwards compatibility with > > existing > > > > connector > > > > > > > configs. > > > > > > > This would still allow to use a regular Connect > cluster > > for an > > > > > arbitrary > > > > > > > number of clusters, thus still having a dedicated > > MirrorMaker > > > cluster > > > > > by > > > > > > > running only MirrorMaker Connectors in there if > you want > > the > > > > > isolation. I > > > > > > > agree that it would not offer the level of > abstraction > > around > > > > > replication > > > > > > > that your concept would enable to implement, but I > think > > if would > > > be > > > > > far > > > > > > > less implementation and maintenance effort. > > > > > > > > > > > > > > But again, all of that is based on my, potentially > > flawed, > > > > > understanding > > > > > > of > > > > > > > your proposal, please feel free to correct me :) > > > > > > > > > > > > > > Best regards, > > > > > > > Sönke > > > > > > > > > > > > > > On Fri, Nov 30, 2018 at 1:39 AM Ryanne Dolan < > > > ryannedo...@gmail.com> > > > > > > > wrote: > > > > > > > > > > > > > > > Sönke, thanks for the feedback! > > > > > > > > > > > > > > > > > the renaming policy [...] can be disabled > [...] The > > KIP itself > > > > > does > > > > > > > not > > > > > > > > mention this > > > > > > > > > > > > > > > > Good catch. I've updated the KIP to call this > out. > > > > > > > > > > > > > > > > > "MirrorMaker clusters" I am not sure I fully > > understand the > > > issue > > > > > you > > > > > > > > are trying to solve > > > > > > > > > > > > > > > > MirrorMaker today is not scalable from an > operational > > > perspective. > > > > > > Celia > > > > > > > > Kung at LinkedIn does a great job of explaining > this > > problem [1], > > > > > which > > > > > > > has > > > > > > > > caused LinkedIn to drop MirrorMaker in favor of > > Brooklin. With > > > > > > Brooklin, > > > > > > > a > > > > > > > > single cluster, single API, and single UI > controls > > replication > > > > flows > > > > > > for > > > > > > > an > > > > > > > > entire data center. With MirrorMaker 2.0, the > vision > > is much the > > > > > same. > > > > > > > > > > > > > > > > If your data center consists of a small number of > > Kafka clusters > > > > and > > > > > an > > > > > > > > existing Connect cluster, it might make more > sense to > > re-use the > > > > > > Connect > > > > > > > > cluster with MirrorSource/SinkConnectors. There's > > nothing wrong > > > > with > > > > > > this > > > > > > > > approach for small deployments, but this model > also > > doesn't > > > scale. 
> > > > > This > > > > > > > is > > > > > > > > because Connect clusters are built around a > single > > Kafka cluster > > > -- > > > > > > what > > > > > > > I > > > > > > > > call the "primary" cluster -- and all Connectors > in > > the cluster > > > > must > > > > > > > either > > > > > > > > consume from or produce to this single cluster. > If you > > have more > > > > than > > > > > > one > > > > > > > > "active" Kafka cluster in each data center, > you'll end > > up needing > > > > > > > multiple > > > > > > > > Connect clusters there as well. > > > > > > > > > > > > > > > > The problem with Connect clusters for > replication is > > way less > > > > severe > > > > > > > > compared to legacy MirrorMaker. Generally you > need one > > Connect > > > > > cluster > > > > > > > per > > > > > > > > active Kafka cluster. As you point out, MM2's > > SinkConnector means > > > > you > > > > > > can > > > > > > > > get away with a single Connect cluster for > topologies > > that center > > > > > > around > > > > > > > a > > > > > > > > single primary cluster. But each Connector > within each > > Connect > > > > > cluster > > > > > > > must > > > > > > > > be configured independently, with no high-level > view > > of your > > > > > > replication > > > > > > > > flows within and between data centers. > > > > > > > > > > > > > > > > With MirrorMaker 2.0, a single MirrorMaker > cluster > > manages > > > > > replication > > > > > > > > across any number of Kafka clusters. Much like > > Brooklin, MM2 does > > > > the > > > > > > > work > > > > > > > > of setting up connectors between clusters as > needed. > > This > > > > > > > > Replication-as-a-Service is a huge win for larger > > deployments, as > > > > > well > > > > > > as > > > > > > > > for organizations that haven't adopted Connect. > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://www.slideshare.net/ConfluentInc/more-data-more-problems-scaling-kafkamirroring-pipelines-at-linkedin > > > > > > > > > > > > > > > > Keep the questions coming! Thanks. > > > > > > > > Ryanne > > > > > > > > > > > > > > > > On Thu, Nov 29, 2018 at 3:30 AM Sönke Liebau < > > > > > > soenke.lie...@opencore.com > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > >> Hi Ryanne, > > > > > > > >> > > > > > > > >> first of all, thanks for the KIP, great work > overall > > and much > > > > > needed I > > > > > > > >> think! > > > > > > > >> > > > > > > > >> I have a small comment on the renaming policy, > in one > > of the > > > mails > > > > > on > > > > > > > >> this thread you mention that this can be > disabled (to > > replicate > > > > > topic1 > > > > > > > in > > > > > > > >> cluster A as topic1 on cluster B I assume). The > KIP > > itself does > > > > not > > > > > > > mention > > > > > > > >> this, from reading just the KIP one might get > the > > assumption > > > that > > > > > > > renaming > > > > > > > >> is mandatory. It might be useful to add a > sentence or > > two around > > > > > > > renaming > > > > > > > >> policies and what is possible here. I assume you > > intend to make > > > > > these > > > > > > > >> pluggable? 
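If the renaming policies do end up pluggable, one could imagine them boiling down to an interface roughly like this (entirely hypothetical, just to make the question concrete):

    /**
     * Hypothetical shape of a pluggable rename policy; not part of the KIP,
     * only an illustration of what "pluggable" could mean here.
     */
    public interface RenamePolicy {

        /** Topic name to use in the target cluster for a record replicated from sourceCluster. */
        String renameTopic(String sourceCluster, String topic);

        /** The default described in the KIP: prefix with the source cluster alias, e.g. "us-west.topic1". */
        RenamePolicy PREFIX_WITH_SOURCE_CLUSTER = (sourceCluster, topic) -> sourceCluster + "." + topic;

        /** Identity policy for users who disable renaming (giving up rename-based cycle detection). */
        RenamePolicy IDENTITY = (sourceCluster, topic) -> topic;
    }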
> > > > > > > >> > > > > > > > >> Regarding the latest addition of "MirrorMaker > > clusters" I am not > > > > > sure > > > > > > I > > > > > > > >> fully understand the issue you are trying to > solve > > and what > > > > exactly > > > > > > > these > > > > > > > >> scripts will do - but that may just me being > dense > > about it :) > > > > > > > >> I understand the limitation to a single source > and > > target > > > cluster > > > > > that > > > > > > > >> Connect imposes, but isn't this worked around > by the > > fact that > > > you > > > > > > have > > > > > > > >> MirrorSource- and MirrorSinkConnectors and one > part > > of the > > > > equation > > > > > > will > > > > > > > >> always be under your control? > > > > > > > >> The way I understood your intention was that > there is > > a > > > (regular, > > > > > not > > > > > > > MM) > > > > > > > >> Connect Cluster somewhere next to a Kafka > Cluster A > > and if you > > > > > deploy > > > > > > a > > > > > > > >> MirrorSourceTask to that it will read messages > from a > > remote > > > > > cluster B > > > > > > > and > > > > > > > >> replicate them into the local cluster A. If you > > deploy a > > > > > > MirrorSinkTask > > > > > > > it > > > > > > > >> will read from local cluster A and replicate > into > > cluster B. > > > > > > > >> > > > > > > > >> Since in both causes the configuration for > cluster B > > will be > > > > passed > > > > > > into > > > > > > > >> the connector in the ConnectorConfig contained > in the > > rest > > > > request, > > > > > > > what's > > > > > > > >> to stop us from starting a third connector with > a > > > MirrorSourceTask > > > > > > > reading > > > > > > > >> from cluster C? > > > > > > > >> > > > > > > > >> I am a bit hesitant about the entire concept of > > having extra > > > > scripts > > > > > > to > > > > > > > >> run an entire separate Connect cluster - I'd > much > > prefer an > > > option > > > > > to > > > > > > > use a > > > > > > > >> regular connect cluster from an ops point of > view. Is > > it maybe > > > > worth > > > > > > > >> spending some time investigating whether we can > come > > up with a > > > > > change > > > > > > to > > > > > > > >> connect that enables what MM would need? > > > > > > > >> > > > > > > > >> Best regards, > > > > > > > >> Sönke > > > > > > > >> > > > > > > > >> > > > > > > > >> > > > > > > > >> On Tue, Nov 27, 2018 at 10:02 PM Ryanne Dolan < > > > > > ryannedo...@gmail.com> > > > > > > > >> wrote: > > > > > > > >> > > > > > > > >>> Hey y'all, I'd like you draw your attention to > a new > > section in > > > > > > KIP-382 > > > > > > > >>> re > > > > > > > >>> MirrorMaker Clusters: > > > > > > > >>> > > > > > > > >>> > > > > > > > >>> > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-382:+MirrorMaker+2.0#KIP-382:MirrorMaker2.0-MirrorMakerClusters > > > > > > > >>> > > > > > > > >>> A common concern I hear about using Connect for > > replication is > > > > that > > > > > > all > > > > > > > >>> SourceConnectors in a Connect cluster must use > the > > same target > > > > > Kafka > > > > > > > >>> cluster, and likewise all SinkConnectors must > use > > the same > > > source > > > > > > Kafka > > > > > > > >>> cluster. In order to use multiple Kafka > clusters > > from Connect, > > > > > there > > > > > > > are > > > > > > > >>> two possible approaches: > > > > > > > >>> > > > > > > > >>> 1) use an intermediate Kafka cluster, K. 
> > SourceConnectors (A, > > > B, > > > > C) > > > > > > > write > > > > > > > >>> to K and SinkConnectors (X, Y, Z) read from K. > This > > enables > > > flows > > > > > > like > > > > > > > A > > > > > > > >>> -> > > > > > > > >>> K - > X but means that some topologies require > > extraneous hops, > > > > and > > > > > > > means > > > > > > > >>> that K must be scaled to handle records from > all > > sources and > > > > sinks. > > > > > > > >>> > > > > > > > >>> 2) use multiple Connect clusters, one for each > > target cluster. > > > > Each > > > > > > > >>> cluster > > > > > > > >>> has multiple SourceConnectors, one for each > source > > cluster. > > > This > > > > > > > enables > > > > > > > >>> direct replication of A -> X but means there > is a > > proliferation > > > > of > > > > > > > >>> Connect > > > > > > > >>> clusters, each of which must be managed > separately. > > > > > > > >>> > > > > > > > >>> Both options are viable for small deployments > > involving a small > > > > > > number > > > > > > > of > > > > > > > >>> Kafka clusters in a small number of data > centers. > > However, > > > > neither > > > > > is > > > > > > > >>> scalable, especially from an operational > standpoint. > > > > > > > >>> > > > > > > > >>> KIP-382 now introduces "MirrorMaker clusters", > which > > are > > > distinct > > > > > > from > > > > > > > >>> Connect clusters. A single MirrorMaker cluster > > provides > > > > > > > >>> "Replication-as-a-Service" among any number of > Kafka > > clusters > > > > via a > > > > > > > >>> high-level REST API based on the Connect API. > Under > > the hood, > > > > > > > MirrorMaker > > > > > > > >>> sets up Connectors between each pair of Kafka > > clusters. The > > > REST > > > > > API > > > > > > > >>> enables on-the-fly reconfiguration of each > > Connector, including > > > > > > updates > > > > > > > >>> to > > > > > > > >>> topic whitelists/blacklists. > > > > > > > >>> > > > > > > > >>> To configure MirrorMaker 2.0, you need a > > configuration file > > > that > > > > > > lists > > > > > > > >>> connection information for each Kafka cluster > > (broker lists, > > > SSL > > > > > > > settings > > > > > > > >>> etc). At a minimum, this looks like: > > > > > > > >>> > > > > > > > >>> clusters=us-west, us-east > > > > > > > >>> > cluster.us-west.broker.list=us-west-kafka-server:9092 > > > > > > > >>> > cluster.us-east.broker.list=us-east-kafka-server:9092 > > > > > > > >>> > > > > > > > >>> You can specify topic whitelists and other > > connector-level > > > > settings > > > > > > > here > > > > > > > >>> too, or you can use the REST API to > remote-control a > > running > > > > > cluster. > > > > > > > >>> > > > > > > > >>> I've also updated the KIP with minor changes to > > bring it in > > > line > > > > > with > > > > > > > the > > > > > > > >>> current implementation. > > > > > > > >>> > > > > > > > >>> Looking forward to your feedback, thanks! > > > > > > > >>> Ryanne > > > > > > > >>> > > > > > > > >>> On Mon, Nov 19, 2018 at 10:26 PM Ryanne Dolan < > > > > > ryannedo...@gmail.com > > > > > > > > > > > > > > >>> wrote: > > > > > > > >>> > > > > > > > >>> > Dan, you've got it right. 
ACL sync will be > done by > > MM2 > > > > > > automatically > > > > > > > >>> > (unless disabled) according to simple rules: > > > > > > > >>> > > > > > > > > >>> > - If a principal has READ access on a topic > in a > > source > > > > cluster, > > > > > > the > > > > > > > >>> same > > > > > > > >>> > principal should have READ access on > downstream > > replicated > > > > topics > > > > > > > >>> ("remote > > > > > > > >>> > topics"). > > > > > > > >>> > - Only MM2 has WRITE access on "remote > topics". > > > > > > > >>> > > > > > > > > >>> > This covers sync from upstream topics like > > "topic1" to > > > > downstream > > > > > > > >>> remote > > > > > > > >>> > topics like "us-west.topic1". What's missing > from > > the KIP, as > > > > you > > > > > > > point > > > > > > > >>> > out, is ACL sync between normal topics > > (non-remote). If a > > > > > consumer > > > > > > > has > > > > > > > >>> READ > > > > > > > >>> > access to topic1 in an upstream cluster, > should it > > have READ > > > > > access > > > > > > > in > > > > > > > >>> > topic1 in a downstream cluster? > > > > > > > >>> > > > > > > > > >>> > I think the answer generally is no, you > don't want > > to give > > > > > > principals > > > > > > > >>> > blanket permissions across all DCs > automatically. > > For > > > example, > > > > > I've > > > > > > > >>> seen > > > > > > > >>> > scenarios where certain topics are replicated > > between an > > > > internal > > > > > > and > > > > > > > >>> > external Kafka cluster. You don't want to > > accidentally push > > > ACL > > > > > > > changes > > > > > > > >>> > across this boundary. > > > > > > > >>> > > > > > > > > >>> > Moreover, it's clear that MM2 "owns" > downstream > > remote topics > > > > > like > > > > > > > >>> > "us-west.topic1" -- MM2 is the only producer > and > > the only > > > admin > > > > > of > > > > > > > >>> these > > > > > > > >>> > topics -- so it's natural to have MM2 set > the ACL > > for these > > > > > topics. > > > > > > > >>> But I > > > > > > > >>> > think it would be surprising if MM2 tried to > > manipulate > > > topics > > > > it > > > > > > > >>> doesn't > > > > > > > >>> > own. So I think granting permissions across > DCs is > > probably > > > > > outside > > > > > > > >>> MM2's > > > > > > > >>> > purview, but I agree it'd be nice to have > tooling > > to help > > > with > > > > > > this. > > > > > > > >>> > > > > > > > > >>> > Thanks. > > > > > > > >>> > Ryanne > > > > > > > >>> > > > > > > > > >>> > -- > > > > > > > >>> > www.ryannedolan.info > > > > > > > >>> > > > > > > > > >>> > > > > > > > > >>> > On Mon, Nov 19, 2018 at 3:58 PM > > daniel.loci...@gmail.com < > > > > > > > >>> > daniel.loci...@gmail.com> wrote: > > > > > > > >>> > > > > > > > > >>> >> Hi guys, > > > > > > > >>> >> > > > > > > > >>> >> This is an exciting topic. could I have a > word > > here? > > > > > > > >>> >> I understand there are many scenarios that > we can > > apply > > > > > > mirrormaker. > > > > > > > >>> I am > > > > > > > >>> >> at the moment working on active/active DC > > solution using > > > > > > > MirrorMaker; > > > > > > > >>> our > > > > > > > >>> >> goal is to allow the clients to failover to > > connect the > > > other > > > > > > kafka > > > > > > > >>> >> cluster (on the other DC) when an incident > > happens. 
> > > > > > > >>> >> > > > > > > > >>> >> To do this, I need: > > > > > > > >>> >> 1 MirrorMaker to replicate the partitioned > > messages in a > > > > > > sequential > > > > > > > >>> order > > > > > > > >>> >> (in timely fashion) to the same partition > on the > > other > > > cluster > > > > > > (also > > > > > > > >>> need > > > > > > > >>> >> keep the promise that both clusters creates > the > > same number > > > of > > > > > > > >>> partitions > > > > > > > >>> >> for a topic) – so that a consumer can pick > up the > > right > > > order > > > > of > > > > > > the > > > > > > > >>> latest > > > > > > > >>> >> messages > > > > > > > >>> >> 2 MirrorMaker to replicate the local > consumer > > offset to the > > > > > other > > > > > > > >>> side – > > > > > > > >>> >> so that the consumer knows where is the > offset/ > > latest > > > > messages > > > > > > > >>> >> 3 MirrorMaker to provide cycle detection for > > messages across > > > > the > > > > > > > DCs. > > > > > > > >>> >> > > > > > > > >>> >> I can see the possibility for Remote Topic > to > > solve all > > > these > > > > > > > >>> problems, > > > > > > > >>> >> as long as the consumer can see the remote > topic > > equally as > > > > the > > > > > > > local > > > > > > > >>> >> topic, i.e. For a consumer which has a > permission > > to consume > > > > > > topic1, > > > > > > > >>> on > > > > > > > >>> >> subscribe event it can automatically > subscribe > > both > > > > > remote.topic1 > > > > > > > and > > > > > > > >>> >> local.topic1. First we need to find a way > for > > topic ACL > > > > granting > > > > > > for > > > > > > > >>> the > > > > > > > >>> >> consumer across the DCs. Secondly the > consumer > > need to be > > > able > > > > > to > > > > > > > >>> subscribe > > > > > > > >>> >> topics with wildcard or suffix. Last but > not the > > least, the > > > > > > consumer > > > > > > > >>> has to > > > > > > > >>> >> deal with the timely ordering of the > messages > > from the 2 > > > > topics. > > > > > > > >>> >> > > > > > > > >>> >> My understanding is, all of these should be > > configurable to > > > be > > > > > > > turned > > > > > > > >>> on > > > > > > > >>> >> or off, to fit for different use cases. > > > > > > > >>> >> > > > > > > > >>> >> Interesting I was going to support topic > messages > > with extra > > > > > > headers > > > > > > > >>> of > > > > > > > >>> >> source DC info, for cycle detection….. > > > > > > > >>> >> > > > > > > > >>> >> Looking forward your reply. > > > > > > > >>> >> > > > > > > > >>> >> Regards, > > > > > > > >>> >> > > > > > > > >>> >> Dan > > > > > > > >>> >> On 2018/10/23 19:56:02, Ryanne Dolan < > > ryannedo...@gmail.com > > > > > > > > > > wrote: > > > > > > > >>> >> > Alex, thanks for the feedback. > > > > > > > >>> >> > > > > > > > > >>> >> > > Would it be possible to utilize the > > > > > > > >>> >> > > Message Headers feature to prevent > infinite > > recursion > > > > > > > >>> >> > > > > > > > > >>> >> > This isn't necessary due to the topic > renaming > > feature > > > which > > > > > > > already > > > > > > > >>> >> > prevents infinite recursion. > > > > > > > >>> >> > > > > > > > > >>> >> > If you turn off topic renaming you lose > cycle > > detection, > > > so > > > > > > maybe > > > > > > > we > > > > > > > >>> >> could > > > > > > > >>> >> > provide message headers as an optional > second > > mechanism. 
> > > I'm > > > > > not > > > > > > > >>> >> opposed to > > > > > > > >>> >> > that idea, but there are ways to improve > > efficiency if we > > > > > don't > > > > > > > >>> need to > > > > > > > >>> >> > modify or inspect individual records. > > > > > > > >>> >> > > > > > > > > >>> >> > Ryanne > > > > > > > >>> >> > > > > > > > > >>> >> > On Tue, Oct 23, 2018 at 6:06 AM Alex > Mironov < > > > > > > > alexandr...@gmail.com > > > > > > > >>> > > > > > > > > >>> >> wrote: > > > > > > > >>> >> > > > > > > > > >>> >> > > Hey Ryanne, > > > > > > > >>> >> > > > > > > > > > >>> >> > > Awesome KIP, exited to see improvements > in > > MirrorMaker > > > > > land, I > > > > > > > >>> >> particularly > > > > > > > >>> >> > > like the reuse of Connect framework! > Would it > > be > > > possible > > > > to > > > > > > > >>> utilize > > > > > > > >>> >> the > > > > > > > >>> >> > > Message Headers feature to prevent > infinite > > recursion? > > > For > > > > > > > >>> example, > > > > > > > >>> >> MM2 > > > > > > > >>> >> > > could stamp every message with a special > > header payload > > > > > (e.g. > > > > > > > >>> >> > > MM2="cluster-name-foo") so in case > another > > MM2 instance > > > > sees > > > > > > > this > > > > > > > >>> >> message > > > > > > > >>> >> > > and it is configured to replicate data > into > > > > > "cluster-name-foo" > > > > > > > it > > > > > > > >>> >> would > > > > > > > >>> >> > > just skip it instead of replicating it > back. > > > > > > > >>> >> > > > > > > > > > >>> >> > > On Sat, Oct 20, 2018 at 5:48 AM Ryanne > Dolan < > > > > > > > >>> ryannedo...@gmail.com> > > > > > > > >>> >> > > wrote: > > > > > > > >>> >> > > > > > > > > > >>> >> > > > Thanks Harsha. Done. > > > > > > > >>> >> > > > > > > > > > > >>> >> > > > On Fri, Oct 19, 2018 at 1:03 AM Harsha > > Chintalapani < > > > > > > > >>> >> ka...@harsha.io> > > > > > > > >>> >> > > > wrote: > > > > > > > >>> >> > > > > > > > > > > >>> >> > > > > Ryanne, > > > > > > > >>> >> > > > > Makes sense. Can you please > add > > this under > > > > > rejected > > > > > > > >>> >> alternatives > > > > > > > >>> >> > > > so > > > > > > > >>> >> > > > > that everyone has context on why it > > wasn’t picked. > > > > > > > >>> >> > > > > > > > > > > > >>> >> > > > > Thanks, > > > > > > > >>> >> > > > > Harsha > > > > > > > >>> >> > > > > On Oct 18, 2018, 8:02 AM -0700, > Ryanne > > Dolan < > > > > > > > >>> >> ryannedo...@gmail.com>, > > > > > > > >>> >> > > > > wrote: > > > > > > > >>> >> > > > > > > > > > > > >>> >> > > > > Harsha, concerning uReplicator > > specifically, the > > > > project > > > > > > is > > > > > > > a > > > > > > > >>> >> major > > > > > > > >>> >> > > > > inspiration for MM2, but I don't > think it > > is a good > > > > > > > >>> foundation for > > > > > > > >>> >> > > > anything > > > > > > > >>> >> > > > > included in Apache Kafka. > uReplicator > > uses Helix to > > > > > solve > > > > > > > >>> >> problems that > > > > > > > >>> >> > > > > Connect also solves, e.g. REST API, > live > > > configuration > > > > > > > >>> changes, > > > > > > > >>> >> cluster > > > > > > > >>> >> > > > > management, coordination etc. 
This > also > > means that > > > > > > existing > > > > > > > >>> >> tooling, > > > > > > > >>> >> > > > > dashboards etc that work with > Connectors > > do not work > > > > > with > > > > > > > >>> >> uReplicator, > > > > > > > >>> >> > > > and > > > > > > > >>> >> > > > > any future tooling would need to > treat > > uReplicator > > > as > > > > a > > > > > > > >>> special > > > > > > > >>> >> case. > > > > > > > >>> >> > > > > > > > > > > > >>> >> > > > > Ryanne > > > > > > > >>> >> > > > > > > > > > > > >>> >> > > > > On Wed, Oct 17, 2018 at 12:30 PM > Ryanne > > Dolan < > > > > > > > >>> >> ryannedo...@gmail.com> > > > > > > > >>> >> > > > > wrote: > > > > > > > >>> >> > > > > > > > > > > > >>> >> > > > >> Harsha, yes I can do that. I'll > update > > the KIP > > > > > > accordingly, > > > > > > > >>> >> thanks. > > > > > > > >>> >> > > > >> > > > > > > > >>> >> > > > >> Ryanne > > > > > > > >>> >> > > > >> > > > > > > > >>> >> > > > >> On Wed, Oct 17, 2018 at 12:18 PM > Harsha < > > > > > ka...@harsha.io > > > > > > > > > > > > > > >>> wrote: > > > > > > > >>> >> > > > >> > > > > > > > >>> >> > > > >>> Hi Ryanne, > > > > > > > >>> >> > > > >>> Thanks for the > KIP. I am > > also > > > curious > > > > > > about > > > > > > > >>> why > > > > > > > >>> >> not > > > > > > > >>> >> > > use > > > > > > > >>> >> > > > >>> the uReplicator design as the > > foundation given it > > > > > > alreadys > > > > > > > >>> >> resolves > > > > > > > >>> >> > > > some of > > > > > > > >>> >> > > > >>> the fundamental issues in current > > MIrrorMaker, > > > > > updating > > > > > > > the > > > > > > > >>> >> confifgs > > > > > > > >>> >> > > > on the > > > > > > > >>> >> > > > >>> fly and running the mirror maker > agents > > in a > > > worker > > > > > > model > > > > > > > >>> which > > > > > > > >>> >> can > > > > > > > >>> >> > > > >>> deployed in mesos or container > > orchestrations. If > > > > > > > possible > > > > > > > >>> can > > > > > > > >>> >> you > > > > > > > >>> >> > > > >>> document in the rejected > alternatives > > what are > > > > missing > > > > > > > parts > > > > > > > >>> >> that > > > > > > > >>> >> > > made > > > > > > > >>> >> > > > you > > > > > > > >>> >> > > > >>> to consider a new design from > ground up. > > > > > > > >>> >> > > > >>> > > > > > > > >>> >> > > > >>> Thanks, > > > > > > > >>> >> > > > >>> Harsha > > > > > > > >>> >> > > > >>> > > > > > > > >>> >> > > > >>> On Wed, Oct 17, 2018, at 8:34 AM, > > Ryanne Dolan > > > > wrote: > > > > > > > >>> >> > > > >>> > Jan, these are two separate > issues. > > > > > > > >>> >> > > > >>> > > > > > > > > >>> >> > > > >>> > 1) consumer coordination should > not, > > ideally, > > > > > involve > > > > > > > >>> >> unreliable or > > > > > > > >>> >> > > > >>> slow > > > > > > > >>> >> > > > >>> > connections. Naively, a > > KafkaSourceConnector > > > would > > > > > > > >>> coordinate > > > > > > > >>> >> via > > > > > > > >>> >> > > the > > > > > > > >>> >> > > > >>> > source cluster. We can do > better than > > this, but > > > > I'm > > > > > > > >>> deferring > > > > > > > >>> >> this > > > > > > > >>> >> > > > >>> > optimization for now. > > > > > > > >>> >> > > > >>> > > > > > > > > >>> >> > > > >>> > 2) exactly-once between two > clusters > > is > > > > > mind-bending. > > > > > > > But > > > > > > > >>> >> keep in > > > > > > > >>> >> > > > mind > > > > > > > >>> >> > > > >>> that > > > > > > > >>> >> > > > >>> > transactions are managed by the > > producer, not > > > the > > > > > > > >>> consumer. 
In > > > > > > > >>> >> > > fact, > > > > > > > >>> >> > > > >>> it's > > > > > > > >>> >> > > > >>> > the producer that requests that > > offsets be > > > > committed > > > > > > for > > > > > > > >>> the > > > > > > > >>> >> > > current > > > > > > > >>> >> > > > >>> > transaction. Obviously, these > offsets > > are > > > > committed > > > > > in > > > > > > > >>> >> whatever > > > > > > > >>> >> > > > >>> cluster the > > > > > > > >>> >> > > > >>> > producer is sending to. > > > > > > > >>> >> > > > >>> > > > > > > > > >>> >> > > > >>> > These two issues are closely > related. > > They are > > > > both > > > > > > > >>> resolved > > > > > > > >>> >> by not > > > > > > > >>> >> > > > >>> > coordinating or committing via > the > > source > > > cluster. > > > > > And > > > > > > > in > > > > > > > >>> >> fact, > > > > > > > >>> >> > > this > > > > > > > >>> >> > > > >>> is the > > > > > > > >>> >> > > > >>> > general model of > SourceConnectors > > anyway, since > > > > most > > > > > > > >>> >> > > SourceConnectors > > > > > > > >>> >> > > > >>> > _only_ have a destination > cluster. > > > > > > > >>> >> > > > >>> > > > > > > > > >>> >> > > > >>> > If there is a lot of interest > here, I > > can > > > expound > > > > > > > further > > > > > > > >>> on > > > > > > > >>> >> this > > > > > > > >>> >> > > > >>> aspect of > > > > > > > >>> >> > > > >>> > MM2, but again I think this is > > premature until > > > > this > > > > > > > first > > > > > > > >>> KIP > > > > > > > >>> >> is > > > > > > > >>> >> > > > >>> approved. > > > > > > > >>> >> > > > >>> > I intend to address each of > these in > > separate > > > KIPs > > > > > > > >>> following > > > > > > > >>> >> this > > > > > > > >>> >> > > > one. > > > > > > > >>> >> > > > >>> > > > > > > > > >>> >> > > > >>> > Ryanne > > > > > > > >>> >> > > > >>> > > > > > > > > >>> >> > > > >>> > On Wed, Oct 17, 2018 at 7:09 AM > Jan > > Filipiak < > > > > > > > >>> >> > > > jan.filip...@trivago.com > > > > > > > >>> >> > > > >>> > > > > > > > > >>> >> > > > >>> > wrote: > > > > > > > >>> >> > > > >>> > > > > > > > > >>> >> > > > >>> > > This is not a performance > > optimisation. Its a > > > > > > > >>> fundamental > > > > > > > >>> >> design > > > > > > > >>> >> > > > >>> choice. > > > > > > > >>> >> > > > >>> > > > > > > > > > >>> >> > > > >>> > > > > > > > > > >>> >> > > > >>> > > I never really took a look how > > streams does > > > > > exactly > > > > > > > >>> once. > > > > > > > >>> >> (its a > > > > > > > >>> >> > > > trap > > > > > > > >>> >> > > > >>> > > anyways and you usually can > deal > > with at least > > > > > once > > > > > > > >>> >> donwstream > > > > > > > >>> >> > > > pretty > > > > > > > >>> >> > > > >>> > > easy). But I am very certain > its > > not gonna get > > > > > > > >>> somewhere if > > > > > > > >>> >> > > offset > > > > > > > >>> >> > > > >>> > > commit and record produce > cluster > > are not the > > > > > same. 
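For readers less familiar with the producer-side mechanics Ryanne alludes to above: with a transactional producer, the offsets consumed from the source ride along in the producer's transaction against the target cluster, roughly like this (the group id and the surrounding class are only illustrative):

    import java.util.Map;

    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.TopicPartition;

    /** Illustrative only: committing source offsets inside a transaction on the target cluster. */
    public class TransactionalMirrorBatch {

        /** The producer is configured against the target cluster with a transactional.id,
         *  and initTransactions() has already been called. */
        public static void mirror(KafkaProducer<byte[], byte[]> producer,
                                  Iterable<ProducerRecord<byte[], byte[]>> records,
                                  Map<TopicPartition, OffsetAndMetadata> sourceOffsets) {
            producer.beginTransaction();
            for (ProducerRecord<byte[], byte[]> record : records) {
                producer.send(record);
            }
            // The offsets read from the source are committed as part of this transaction,
            // i.e. in whichever cluster this producer is sending to.
            producer.sendOffsetsToTransaction(sourceOffsets, "mirror-group");
            producer.commitTransaction();
        }
    }

Which is exactly why the offsets end up in the cluster the producer talks to, not the cluster the records came from.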
> > > > > > > >>> >> > > > >>> > > > > > > > > > >>> >> > > > >>> > > Pretty sure without this > _design > > choice_ you > > > can > > > > > > skip > > > > > > > on > > > > > > > >>> >> that > > > > > > > >>> >> > > > exactly > > > > > > > >>> >> > > > >>> > > once already > > > > > > > >>> >> > > > >>> > > > > > > > > > >>> >> > > > >>> > > Best Jan > > > > > > > >>> >> > > > >>> > > > > > > > > > >>> >> > > > >>> > > On 16.10.2018 18:16, Ryanne > Dolan > > wrote: > > > > > > > >>> >> > > > >>> > > > > But one big obstacle in > this > > was > > > > > > > >>> >> > > > >>> > > > always that group > coordination > > happened on > > > the > > > > > > > source > > > > > > > >>> >> cluster. > > > > > > > >>> >> > > > >>> > > > > > > > > > > >>> >> > > > >>> > > > Jan, thank you for bringing > up > > this issue > > > with > > > > > > > legacy > > > > > > > >>> >> > > > MirrorMaker. > > > > > > > >>> >> > > > >>> I > > > > > > > >>> >> > > > >>> > > > totally agree with you. > This is > > one of > > > several > > > > > > > >>> problems > > > > > > > >>> >> with > > > > > > > >>> >> > > > >>> MirrorMaker > > > > > > > >>> >> > > > >>> > > > I intend to solve in MM2, > and I > > already > > > have a > > > > > > > design > > > > > > > >>> and > > > > > > > >>> >> > > > >>> prototype that > > > > > > > >>> >> > > > >>> > > > solves this and related > issues. > > But as you > > > > > pointed > > > > > > > >>> out, > > > > > > > >>> >> this > > > > > > > >>> >> > > KIP > > > > > > > >>> >> > > > is > > > > > > > >>> >> > > > >>> > > > already rather complex, and > I > > want to focus > > > on > > > > > the > > > > > > > >>> core > > > > > > > >>> >> feature > > > > > > > >>> >> > > > set > > > > > > > >>> >> > > > >>> > > > rather than performance > > optimizations for > > > now. > > > > > If > > > > > > we > > > > > > > >>> can > > > > > > > >>> >> agree > > > > > > > >>> >> > > on > > > > > > > >>> >> > > > >>> what > > > > > > > >>> >> > > > >>> > > > MM2 looks like, it will be > very > > easy to > > > agree > > > > to > > > > > > > >>> improve > > > > > > > >>> >> its > > > > > > > >>> >> > > > >>> performance > > > > > > > >>> >> > > > >>> > > > and reliability. > > > > > > > >>> >> > > > >>> > > > > > > > > > > >>> >> > > > >>> > > > That said, I look forward > to your > > support > > > on a > > > > > > > >>> subsequent > > > > > > > >>> >> KIP > > > > > > > >>> >> > > > that > > > > > > > >>> >> > > > >>> > > > addresses consumer > coordination > > and > > > rebalance > > > > > > > issues. > > > > > > > >>> Stay > > > > > > > >>> >> > > tuned! > > > > > > > >>> >> > > > >>> > > > > > > > > > > >>> >> > > > >>> > > > Ryanne > > > > > > > >>> >> > > > >>> > > > > > > > > > > >>> >> > > > >>> > > > On Tue, Oct 16, 2018 at > 6:58 AM > > Jan > > > Filipiak < > > > > > > > >>> >> > > > >>> jan.filip...@trivago.com > > > > > > > >>> >> > > > >>> > > > <mailto: > jan.filip...@trivago.com>> > > wrote: > > > > > > > >>> >> > > > >>> > > > > > > > > > > >>> >> > > > >>> > > > Hi, > > > > > > > >>> >> > > > >>> > > > > > > > > > > >>> >> > > > >>> > > > Currently MirrorMaker is > > usually run > > > > > > collocated > > > > > > > >>> with > > > > > > > >>> >> the > > > > > > > >>> >> > > > target > > > > > > > >>> >> > > > >>> > > > cluster. > > > > > > > >>> >> > > > >>> > > > This is all nice and > good. 
> > But one big > > > > > > obstacle > > > > > > > in > > > > > > > >>> >> this was > > > > > > > >>> >> > > > >>> > > > always that group > > coordination happened > > > on > > > > > the > > > > > > > >>> source > > > > > > > >>> >> > > > cluster. > > > > > > > >>> >> > > > >>> So > > > > > > > >>> >> > > > >>> > > when > > > > > > > >>> >> > > > >>> > > > then network was > congested, > > you > > > sometimes > > > > > > loose > > > > > > > >>> group > > > > > > > >>> >> > > > >>> membership and > > > > > > > >>> >> > > > >>> > > > have to rebalance and > all > > this. > > > > > > > >>> >> > > > >>> > > > > > > > > > > >>> >> > > > >>> > > > So one big request from > we > > would be the > > > > > > support > > > > > > > of > > > > > > > >>> >> having > > > > > > > >>> >> > > > >>> > > coordination > > > > > > > >>> >> > > > >>> > > > cluster != source > cluster. > > > > > > > >>> >> > > > >>> > > > > > > > > > > >>> >> > > > >>> > > > I would generally say a > LAN > > is better > > > > than a > > > > > > WAN > > > > > > > >>> for > > > > > > > >>> >> doing > > > > > > > >>> >> > > > >>> group > > > > > > > >>> >> > > > >>> > > > coordinaton and there > is no > > reason we > > > > > couldn't > > > > > > > >>> have a > > > > > > > >>> >> group > > > > > > > >>> >> > > > >>> consuming > > > > > > > >>> >> > > > >>> > > > topics from a different > > cluster and > > > > > committing > > > > > > > >>> >> offsets to > > > > > > > >>> >> > > > >>> another > > > > > > > >>> >> > > > >>> > > > one right? > > > > > > > >>> >> > > > >>> > > > > > > > > > > >>> >> > > > >>> > > > Other than that. It > feels > > like the KIP > > > has > > > > > too > > > > > > > >>> much > > > > > > > >>> >> > > features > > > > > > > >>> >> > > > >>> where > > > > > > > >>> >> > > > >>> > > many > > > > > > > >>> >> > > > >>> > > > of them are not really > wanted > > and > > > counter > > > > > > > >>> productive > > > > > > > >>> >> but I > > > > > > > >>> >> > > > >>> will just > > > > > > > >>> >> > > > >>> > > > wait and see how the > > discussion goes. > > > > > > > >>> >> > > > >>> > > > > > > > > > > >>> >> > > > >>> > > > Best Jan > > > > > > > >>> >> > > > >>> > > > > > > > > > > >>> >> > > > >>> > > > > > > > > > > >>> >> > > > >>> > > > On 15.10.2018 18:16, > Ryanne > > Dolan wrote: > > > > > > > >>> >> > > > >>> > > > > Hey y'all! > > > > > > > >>> >> > > > >>> > > > > > > > > > > > >>> >> > > > >>> > > > > Please take a look at > > KIP-382: > > > > > > > >>> >> > > > >>> > > > > > > > > > > > >>> >> > > > >>> > > > > > > > > > > > >>> >> > > > >>> > > > > > > > > > > >>> >> > > > >>> > > > > > > > > > >>> >> > > > >>> > > > > > > > >>> >> > > > > > > > > > > >>> >> > > > > > > > > > >>> >> > > > > > > > >>> > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0 > > > > > > > >>> >> > > > >>> > > > > > > > > > > > >>> >> > > > >>> > > > > Thanks for your > feedback > > and support. 
> > > > > > > >>> >> > > > >>> > > > > > > > > > > > >>> >> > > > >>> > > > > Ryanne > > > > > > > >>> >> > > > >>> > > > > > > > > > > > >>> >> > > > >>> > > > > > > > > > > >>> >> > > > >>> > > > > > > > > > >>> >> > > > >>> > > > > > > > >>> >> > > > >> > > > > > > > >>> >> > > > > > > > > > > >>> >> > > > > > > > > > >>> >> > > > > > > > > > >>> >> > > -- > > > > > > > >>> >> > > Best, > > > > > > > >>> >> > > Alex Mironov > > > > > > > >>> >> > > > > > > > > > >>> >> > > > > > > > > >>> >> > > > > > > > >>> > > > > > > > > >>> > > > > > > > >> > > > > > > > >> > > > > > > > >> -- > > > > > > > >> Sönke Liebau > > > > > > > >> Partner > > > > > > > >> Tel. +49 179 7940878 > > > > > > > >> OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - > 22880 > > Wedel - > > > > > Germany > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Sönke Liebau > > > > > > > Partner > > > > > > > Tel. +49 179 7940878 > > > > > > > OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - > 22880 > > Wedel - > > > Germany > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Sönke Liebau > > > > > Partner > > > > > Tel. +49 179 7940878 > > > > > OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 > Wedel > > - Germany > > > > > > > > > > > > > > > > > > -- > > > Sönke Liebau > > > Partner > > > Tel. +49 179 7940878 > > > OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 > Wedel - > > Germany > > > > > > > > > > > > > The information contained in this email is strictly confidential and > for > > the use of the addressee only, unless otherwise indicated. If you > are not > > the intended recipient, please do not read, copy, use or disclose to > others > > this message or any attachment. Please also notify the sender by > replying > > to this email or by telephone (+44(020 7896 0011) and then delete > the email > > and any copies of it. Opinions, conclusion (etc) that do not relate > to the > > official business of this company shall be understood as neither > given nor > > endorsed by it. IG is a trading name of IG Markets Limited (a company > > registered in England and Wales, company number 04008957) and IG > Index > > Limited (a company registered in England and Wales, company number > > 01190902). Registered address at Cannon Bridge House, 25 Dowgate > Hill, > > London EC4R 2YA. Both IG Markets Limited (register number 195355) > and IG > > Index Limited (register number 114059) are authorised and regulated > by the > > Financial Conduct Authority. > > > > > The information contained in this email is strictly confidential and for > the use of the addressee only, unless otherwise indicated. If you are not > the intended recipient, please do not read, copy, use or disclose to others > this message or any attachment. Please also notify the sender by replying > to this email or by telephone (+44(020 7896 0011) and then delete the email > and any copies of it. Opinions, conclusion (etc) that do not relate to the > official business of this company shall be understood as neither given nor > endorsed by it. IG is a trading name of IG Markets Limited (a company > registered in England and Wales, company number 04008957) and IG Index > Limited (a company registered in England and Wales, company number > 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill, > London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG > Index Limited (register number 114059) are authorised and regulated by the > Financial Conduct Authority. >