Hi Chris, I went through the KIP and updated it based on our discussion. I think your suggestions simplified (and shortened) the KIP significantly.
Thanks, Daniel Dániel Urbán <dur...@cloudera.com.invalid> ezt írta (időpont: 2022. szept. 16., P, 15:15): > Hi Chris, > > 1. For the REST-server-per-flow setup, it made sense to introduce some > simplified configuration. With a single REST server, it doesn't make sense > anymore, I'm removing it from the KIP. > 2. I think that changing the worker ID generation still makes sense, > otherwise there is no way to differentiate between the MM2 instances. > > Thanks, > Daniel > > On Wed, Aug 31, 2022 at 8:39 PM Chris Egerton <chr...@aiven.io.invalid> > wrote: > > > Hi Daniel, > > > > I've taken a look at the KIP in detail. Here are my complete thoughts > > (minus the aforementioned sections that may be affected by changes to an > > internal-only REST API): > > > > 1. Why introduce new mm.host.name and mm.rest.protocol properties > instead > > of using the properties that are already used by Kafka Connect: > listeners, > > rest.advertised.host.name, rest.advertised.host.port, and > > rest.advertised.listener? We used to have the rest.host.name and > rest.port > > properties in Connect but deprecated and eventually removed them in favor > > of the listeners property in KIP-208 [1]; I'm hoping we can keep things > as > > similar as possible between MM2 and Connect in order to make it easier > for > > users to work with both. I'm also hoping that we can allow users to > > configure the port that their MM2 nodes listen on instead of hardcoding > MM2 > > to bind to port 0. > > > > 2. Do we still need to change the worker IDs that get used in the status > > topic? > > > > Everything else looks good, or should change once the KIP is updated with > > the internal-only REST API alternative. > > > > Cheers, > > > > Chris > > > > [1] - > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-208%3A+Add+SSL+support+to+Kafka+Connect+REST+interface > > > > On Mon, Aug 29, 2022 at 1:55 PM Chris Egerton <chr...@aiven.io> wrote: > > > > > Hi Daniel, > > > > > > Yeah, I think that's the way to go. Adding multiple servers for each > > > herder seems like it'd be too much of a pain for users to configure, > and > > if > > > we keep the API strictly internal for now, we shouldn't be painting > > > ourselves into too much of a corner if/when we decide to expose a > > > public-facing REST API for dedicated MM2 clusters. > > > > > > I plan to take a look at the rest of the KIP and provide a complete > > review > > > sometime this week; I'll hold off on commenting on anything that seems > > like > > > it'll be affected by switching to an internal-only REST API until those > > > changes are published, but should be able to review everything else. > > > > > > Cheers, > > > > > > Chris > > > > > > On Mon, Aug 29, 2022 at 6:57 AM Dániel Urbán <urb.dani...@gmail.com> > > > wrote: > > > > > >> Hi Chris, > > >> > > >> I understand your point, sounds good to me. > > >> So in short, we should opt for an internal-only API, and preferably a > > >> single server solution. Is that right? > > >> > > >> Thanks > > >> Daniel > > >> > > >> Chris Egerton <chr...@aiven.io.invalid> ezt írta (időpont: 2022. aug. > > >> 26., > > >> P, 17:36): > > >> > > >> > Hi Daniel, > > >> > > > >> > Glad to hear from you! > > >> > > > >> > With regards to the stripped-down REST API alternative, I don't see > > how > > >> > this would prevent us from introducing the fully-fledged Connect > REST > > >> API, > > >> > or even an augmented variant of it, at some point down the road. If > we > > >> go > > >> > with the internal-only API now, and want to expand later, can't we > > gate > > >> the > > >> > expansion behind a feature flag configuration property that by > default > > >> > disables the new feature? > > >> > > > >> > I'm also not sure that we'd ever want to expose the raw Connect REST > > API > > >> > for dedicated MM2 clusters. If people want that capability, they can > > >> > already spin up a vanilla Connect cluster and run as many MM2 > > >> connectors as > > >> > they'd like on it, and as of KIP-458 [1], it's even possible to use > a > > >> > single Connect cluster to replicate between any two Kafka clusters > > >> instead > > >> > of only targeting the Kafka cluster that the vanilla Connect cluster > > >> > operates on top of. I do agree that it'd be great to be able to > > >> dynamically > > >> > adjust things like topic filters without having to restart a > dedicated > > >> MM2 > > >> > node; I'm just not sure that the vanilla Connect REST API is the > > >> > appropriate way to do that, especially since the exact mechanisms > that > > >> make > > >> > a single Connect cluster viable for replicating across any two Kafka > > >> > clusters could be abused and cause a dedicated MM2 cluster to start > > >> writing > > >> > to a completely different Kafka cluster that's not even defined in > its > > >> > config file. > > >> > > > >> > Finally, as far as security goes--since this is essentially a bug > fix, > > >> I'm > > >> > inclined to make it as easy as possible for users to adopt it. MTLS > > is a > > >> > great start for securing a REST API, but it's not sufficient on its > > own > > >> > since anyone who could issue an authenticated REST request against > the > > >> MM2 > > >> > cluster would still be able to make any changes they want (with the > > >> > exception of accessing internal endpoints, which were secured with > > >> > KIP-507). If we were to bring up the fully-fledged Connect REST API, > > >> > cluster administrators would also likely have to add some kind of > > >> > authorization layer to prevent people from using the REST API to > mess > > >> with > > >> > the configurations of the connectors that MM2 brought up. One way of > > >> doing > > >> > that is to add a REST extension to your Connect cluster, but > > >> implementing > > >> > and configuring one in order to be able to run a multi-node MM2 > > cluster > > >> > without hitting this bug seems like too much work to be worth it. > > >> > > > >> > I think if we had a better picture of what a REST API for dedicated > > MM2 > > >> > clusters would/should look like, then it would be easier to go along > > >> with > > >> > this, and we could even just add the feature flag in this KIP right > > now > > >> to > > >> > address any security concerns. My instinct would be to address this > > in a > > >> > follow-up KIP in order to reduce scope creep, though, and keep this > > KIP > > >> > focused on addressing the bug with multi-node dedicated MM2 > clusters. > > >> What > > >> > do you think? > > >> > > > >> > Cheers, > > >> > > > >> > Chris > > >> > > > >> > [1] - > > >> > > > >> > > > >> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-458%3A+Connector+Client+Config+Override+Policy > > >> > > > >> > On Thu, Aug 25, 2022 at 3:55 AM Dániel Urbán <urb.dani...@gmail.com > > > > >> > wrote: > > >> > > > >> > > Hi Chris, > > >> > > > > >> > > Thanks for bringing this up again :) > > >> > > > > >> > > 1. I think that is reasonable, though I find the current state of > > MM2 > > >> to > > >> > be > > >> > > confusing. The current issue with the distributed mode is not > > >> documented > > >> > > properly, but maybe the logging will help a bit. > > >> > > > > >> > > 2. Going for an internal-only Connect REST version would lock MM2 > > out > > >> of > > >> > a > > >> > > path where the REST API can be used to dynamically reconfigure > > >> > > replications. For now, I agree, it would be easy to corrupt the > > state > > >> of > > >> > > MM2 if someone wanted to use the properties and the REST at the > same > > >> > time, > > >> > > but in the future, we might have a chance to introduce a different > > >> config > > >> > > mechanism, where only the cluster connections have to be specified > > in > > >> the > > >> > > properties file, and everything else can be configured through > REST > > >> > > (enabling replications, changing topic filters, etc.). Because of > > >> this, > > >> > I'm > > >> > > leaning towards a full Connect REST API. To avoid issues with > > >> conflicts > > >> > > between the props file and the REST, we could document security > best > > >> > > practices (e.g. turn on basic auth or mTLS on the Connect REST to > > >> avoid > > >> > > unwanted interactions). > > >> > > > > >> > > 3. That is a good point, and I agree, a big plus for motivation. > > >> > > > > >> > > I have a working version of this in which all flows spin up a > > >> dedicated > > >> > > Connect REST, but I can give other solutions a try, too. > > >> > > > > >> > > Thanks, > > >> > > Daniel > > >> > > > > >> > > Chris Egerton <chr...@aiven.io.invalid> ezt írta (időpont: 2022. > > aug. > > >> > 24., > > >> > > Sze, 17:46): > > >> > > > > >> > > > Hi Daniel, > > >> > > > > > >> > > > I'd like to resurface this KIP in case you're still interested > in > > >> > > pursuing > > >> > > > it. I know it's been a while since you published it, and it > hasn't > > >> > > received > > >> > > > much attention, but I'm hoping we can give it a try now and > > finally > > >> put > > >> > > > this long-standing bug to rest. To that end, I have some > thoughts > > >> about > > >> > > the > > >> > > > proposal. This isn't a complete review, but I wanted to give > > enough > > >> to > > >> > > get > > >> > > > the ball rolling: > > >> > > > > > >> > > > 1. Some environments with firewalls or strict security policies > > may > > >> not > > >> > > be > > >> > > > able to bring up a REST server for each MM2 node. If we decide > > that > > >> > we'd > > >> > > > like to use the Connect REST API (or even just parts of it) to > > >> address > > >> > > this > > >> > > > bug with MM2, it does make sense to eventually make the > > >> availability of > > >> > > the > > >> > > > REST API a hard requirement for running MM2, but it might be a > bit > > >> too > > >> > > > abrupt to do that all in a single release. What do you think > about > > >> > making > > >> > > > the REST API optional for now, but noting that it will become > > >> required > > >> > > in a > > >> > > > later release (probably 4.0.0 or, if that's not enough time, > > >> 5.0.0)? We > > >> > > > could choose not to bring the REST server for any node whose > > >> > > configuration > > >> > > > doesn't explicitly opt into one, and maybe log a warning message > > on > > >> > > startup > > >> > > > if none is configured. In effect, we'd be marking the current > mode > > >> (no > > >> > > REST > > >> > > > server) as deprecated. > > >> > > > > > >> > > > 2. I'm not sure that we should count out the "Creating an > > >> internal-only > > >> > > > derivation of the Connect REST API" rejected alternative. Right > > now, > > >> > the > > >> > > > single source of truth for the configuration of a MM2 cluster > > >> (assuming > > >> > > > it's being run in dedicated mode, and not as a connector in a > > >> vanilla > > >> > > > Connect cluster) is the configuration file used for the process. > > By > > >> > > > bringing up the REST API, we'd expose endpoints to modify > > connector > > >> > > > configurations, which would not only add complexity to the > > operation > > >> > of a > > >> > > > MM2 cluster, but even qualify as an attack vector for malicious > > >> > entities. > > >> > > > Thanks to KIP-507 we have some amount of security around the > > >> > > internal-only > > >> > > > endpoints used by the Connect framework, but for any public > > >> endpoints, > > >> > > the > > >> > > > Connect REST API doesn't come with any security out of the box. > > >> > > > > > >> > > > 3. Small point, but with support for exactly-once source > > connectors > > >> > > coming > > >> > > > out in 3.3.0, it's also worth noting that that's another feature > > >> that > > >> > > won't > > >> > > > work properly with multi-node MM2 clusters without adding a REST > > >> server > > >> > > for > > >> > > > each node (or some substitute that accomplishes the same goal). > I > > >> don't > > >> > > > think this will affect the direction of the design discussion > too > > >> much, > > >> > > but > > >> > > > it does help strengthen the motivation. > > >> > > > > > >> > > > Cheers, > > >> > > > > > >> > > > Chris > > >> > > > > > >> > > > On 2021/02/18 15:57:36 Dániel Urbán wrote: > > >> > > > > Hello everyone, > > >> > > > > > > >> > > > > * Sorry, I meant KIP-710. > > >> > > > > > > >> > > > > Right now the MirrorMaker cluster is somewhat unreliable, and > > not > > >> > > > > supporting running in a cluster properly. I'd say that fixing > > this > > >> > > would > > >> > > > be > > >> > > > > a nice addition. > > >> > > > > Does anyone have some input on this? > > >> > > > > > > >> > > > > Thanks in advance > > >> > > > > Daniel > > >> > > > > > > >> > > > > Dániel Urbán <ur...@gmail.com> ezt írta (időpont: 2021. jan. > > >> 26., K, > > >> > > > > 15:56): > > >> > > > > > > >> > > > > > Hello everyone, > > >> > > > > > > > >> > > > > > I would like to start a discussion on KIP-709, which > addresses > > >> some > > >> > > > > > missing features in MM2 dedicated mode. > > >> > > > > > > > >> > > > > > > > >> > > > > > >> > > > > > >> > > > > >> > > > >> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-710%3A+Full+support+for+distributed+mode+in+dedicated+MirrorMaker+2.0+clusters > > >> > > > > > > > >> > > > > > Currently, the dedicated mode of MM2 does not fully support > > >> running > > >> > > in > > >> > > > a > > >> > > > > > cluster. The core issue is that the Connect REST Server is > not > > >> > > included > > >> > > > in > > >> > > > > > the dedicated mode, which makes follower->leader > communication > > >> > > > impossible. > > >> > > > > > In some cases, this results in the cluster not being able to > > >> react > > >> > to > > >> > > > > > dynamic configuration changes (e.g. dynamic topic filter > > >> changes). > > >> > > > > > Another smaller detail is that MM2 dedicated mode eagerly > > >> resolves > > >> > > > config > > >> > > > > > provider references in the Connector configurations, which > is > > >> > > > undesirable > > >> > > > > > and a breaking change compared to vanilla Connect. This can > > >> cause > > >> > an > > >> > > > issue > > >> > > > > > for example when there is an environment variable reference, > > >> which > > >> > > > contains > > >> > > > > > some host-specific information, like a file path. The leader > > >> > resolves > > >> > > > the > > >> > > > > > reference eagerly, and the resolved value is propagated to > > other > > >> > MM2 > > >> > > > nodes > > >> > > > > > instead of the reference being resolved locally, separately > on > > >> each > > >> > > > node. > > >> > > > > > > > >> > > > > > The KIP addresses these by adding the Connect REST Server to > > the > > >> > MM2 > > >> > > > > > dedicated mode for each replication flow, and postponing the > > >> config > > >> > > > > > provider reference resolution. > > >> > > > > > > > >> > > > > > Please discuss, I know this is a major change, but also an > > >> > important > > >> > > > > > feature for MM2 users. > > >> > > > > > > > >> > > > > > Daniel > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > > > > >