Thanks Chris for the discussion.

Since nobody else expressed interest in this KIP, I'm going to discard it.

On Wed, Mar 8, 2023 at 8:41 PM Chris Egerton <chr...@aiven.io.invalid> wrote:
>
> Hi Mickael,
>
> I do sympathize with the desire for a "quick fix". I think your point about
> these being problematic to support sums up my hesitation here pretty well,
> both with respect to the potential footgun of unintended rebalances (should
> users try to do more with this field than we expect), and with making other
> similar improvements (i.e., expanded metadata support in Kafka Connect)
> more difficult to design correctly for us and to use effectively for users
> (especially if we were to deprecate/remove the description field at a later
> date).
>
> It's also worth noting that users can already add a "description" field to
> the JSON config for any connector they create; the benefit provided by this
> KIP is that they're automatically prompted to do so if they're using a
> UI/CLI/etc. that builds on top of the various /connector-plugins endpoints
> to find the set of configuration properties that a user can/should populate
> before creating a connector.
>
> I'd welcome thoughts from you and others on whether the tradeoffs here are
> worth it; right now I'm still not sure about this one.
>
> Cheers,
>
> Chris
>
> On Tue, Mar 7, 2023 at 6:42 AM Mickael Maison <mickael.mai...@gmail.com>
> wrote:
>
> > H Chris,
> >
> > Thanks for taking a look.
> >
> > 1. Yes updating the description can potentially trigger a rebalance. I
> > don't expect users to frequently update the description so I thought
> > this was acceptable. I've added a note to the KIP to mention it.
> >
> > 2. The tags model you described could be interesting but it looks
> > pretty involved with multiple new endpoints and brand new concepts.
> >
> > With this KIP, I really took the most basic approach and proposed the
> > simplest set of changes that could get in the next release and, I
> > think, immediately bring benefits. However sometimes this is not the
> > best approach as a "quick fix" could end up problematic to support in
> > the future. The drawbacks (new connector config field + causing a
> > rebalance on update) look relatively benign to me so I thought this
> > could be an acceptable proposal.
> >
> > Thanks,
> > Mickael
> >
> >
> >
> > On Thu, Feb 23, 2023 at 2:45 PM Chris Egerton <chr...@aiven.io.invalid>
> > wrote:
> > >
> > > Actually, I misspoke--a rebalance isn't triggered when an existing
> > > connector's config is updated. Assuming the set of workers remains
> > stable,
> > > a rebalance is only necessary when a new connector is created, an
> > existing
> > > one is deleted, or a new set of task configs is generated.
> > >
> > > This weakens the point about unnecessary rebalances when a connector's
> > > description is updated, but doesn't entirely address it. Spurious
> > > rebalances may still be triggered if a new set of task configs is
> > > generated, which for reasons outlined above, is fairly likely.
> > >
> > > On Thu, Feb 23, 2023 at 7:41 AM Chris Egerton <chr...@aiven.io> wrote:
> > >
> > > > Hi Mickael,
> > > >
> > > > Thanks for the KIP!
> > > >
> > > > While it's tempting to add this field to the out-of-the-box connector
> > > > config def, I'm a little hesitant, for two reasons.
> > > >
> > > > 1. Adding this directly to the connector config could have unintended
> > > > consequences on the actual data processing by the connector. Any time a
> > > > connector's config is modified, the Connector object running for it is
> > > > restarted with that new config. In most cases this is a trivial
> > operation
> > > > since we have incremental rebalancing enabled by default, the
> > connector can
> > > > (and probably should!) generate task configs that are functionally
> > > > identical to the ones it last generated, and most (though not all)
> > > > Connector classes are fairly lightweight and leave the real work to
> > their
> > > > Task class. However, due to KAFKA-9228 [1], it's not just common
> > practice
> > > > for connectors to perform transparent passthrough of most of their
> > configs
> > > > when generating task configs, it's actually necessary to work around a
> > bug
> > > > in the runtime. As a result, tweaking the description of a connector
> > would
> > > > be fairly likely to result in a full restart of all of its tasks, in
> > > > addition to triggering two rebalances (which may not be so lightweight
> > if
> > > > users are still running with eager rebalancing... which, sadly, I've
> > heard
> > > > is still happening today).
> > > >
> > > > 2. The motivation section mentions some information that might go into
> > the
> > > > description field, such as the team that owns the connector and
> > emergency
> > > > contact info. It seems like this info might benefit from a little more
> > > > structure if we're trying to design for programmatic access by GUIs and
> > > > CLIs (which I'm assuming is the case, since I can't imagine a human
> > being
> > > > getting much use out of the raw output from the GET
> > > > /connector-plugins/{name}/config and PUT
> > > > /connector-plugins/{name}/config/validate endpoints). This might also
> > make
> > > > it easier to add custom validation logic around what kind of
> > information is
> > > > present via REST extension.
> > > >
> > > >
> > > > With these thoughts in mind, what do you think about adding a new
> > generic
> > > > "tags" object to connectors that can support arbitrary user-provided
> > > > key/value pairs? If using the POST /connectors endpoint, it might look
> > > > something like this:
> > > >
> > > > {
> > > >   "name": "my-connector",
> > > >   "config": {
> > > >     "connector.class": "MirrorSource",
> > > >     "tasks.max": "908"
> > > >   },
> > > >   "tags": {
> > > >     "team": "data-infra",
> > > >     "phone": "12345678901",
> > > >     "email": "din...@example.org"
> > > >   }
> > > > }
> > > >
> > > > And, to allow users to modify connector tags after one has been
> > created,
> > > > we might introduce a /connectors/{name}/tags endpoint with
> > PUT/PATCH/DELETE
> > > > verbs that writes tag info for a connector to the config topic, but
> > without
> > > > altering the actual connector config (allowing us to skip a rebalance
> > > > altogether).
> > > >
> > > > One other thing we could consider is allowing cluster administrators to
> > > > require that connectors are created with a certain set of tags, or
> > even add
> > > > per-tag regex validation. They might specify something like
> > > > "connector.required.tags = team, phone, email" or
> > > > "connector.tags.phone.regex = [0-9]{11}" in a worker config file. But
> > this
> > > > is probably overboard for now, especially since it's already possible
> > to
> > > > accomplish via a REST extension.
> > > >
> > > > Finally, I'm also wondering if, in pursuit of expanding Connect's
> > > > out-of-the-box support for dealing with connector metadata, we might
> > want
> > > > to expose created/last-updated times for connector configurations. We
> > > > definitely don't have to do that in this KIP but if you agree that this
> > > > would be useful, we should probably keep it in mind to avoid painting
> > > > ourselves into a corner. This is why I'm thinking of using "tags"
> > instead
> > > > of something a little more generic like "metadata", BTW.
> > > >
> > > > [1] -
> > > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-908%3A+Add+description+field+to+connector+configuration
> > > >
> > > > Cheers,
> > > >
> > > > Chris
> > > >
> > > > On Thu, Feb 23, 2023 at 4:52 AM Mickael Maison <
> > mickael.mai...@gmail.com>
> > > > wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> I created a very small KIP to add a description field to connectors:
> > > >>
> > > >>
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-908%3A+Add+description+field+to+connector+configuration
> > > >>
> > > >> Let me know if you have any feedback.
> > > >>
> > > >> Thanks,
> > > >> Mickael
> > > >>
> > > >
> >

Reply via email to