Hi Mickael, Thanks for the KIP!
While it's tempting to add this field to the out-of-the-box connector config def, I'm a little hesitant, for two reasons. 1. Adding this directly to the connector config could have unintended consequences on the actual data processing by the connector. Any time a connector's config is modified, the Connector object running for it is restarted with that new config. In most cases this is a trivial operation since we have incremental rebalancing enabled by default, the connector can (and probably should!) generate task configs that are functionally identical to the ones it last generated, and most (though not all) Connector classes are fairly lightweight and leave the real work to their Task class. However, due to KAFKA-9228 [1], it's not just common practice for connectors to perform transparent passthrough of most of their configs when generating task configs, it's actually necessary to work around a bug in the runtime. As a result, tweaking the description of a connector would be fairly likely to result in a full restart of all of its tasks, in addition to triggering two rebalances (which may not be so lightweight if users are still running with eager rebalancing... which, sadly, I've heard is still happening today). 2. The motivation section mentions some information that might go into the description field, such as the team that owns the connector and emergency contact info. It seems like this info might benefit from a little more structure if we're trying to design for programmatic access by GUIs and CLIs (which I'm assuming is the case, since I can't imagine a human being getting much use out of the raw output from the GET /connector-plugins/{name}/config and PUT /connector-plugins/{name}/config/validate endpoints). This might also make it easier to add custom validation logic around what kind of information is present via REST extension. With these thoughts in mind, what do you think about adding a new generic "tags" object to connectors that can support arbitrary user-provided key/value pairs? If using the POST /connectors endpoint, it might look something like this: { "name": "my-connector", "config": { "connector.class": "MirrorSource", "tasks.max": "908" }, "tags": { "team": "data-infra", "phone": "12345678901", "email": "din...@example.org" } } And, to allow users to modify connector tags after one has been created, we might introduce a /connectors/{name}/tags endpoint with PUT/PATCH/DELETE verbs that writes tag info for a connector to the config topic, but without altering the actual connector config (allowing us to skip a rebalance altogether). One other thing we could consider is allowing cluster administrators to require that connectors are created with a certain set of tags, or even add per-tag regex validation. They might specify something like "connector.required.tags = team, phone, email" or "connector.tags.phone.regex = [0-9]{11}" in a worker config file. But this is probably overboard for now, especially since it's already possible to accomplish via a REST extension. Finally, I'm also wondering if, in pursuit of expanding Connect's out-of-the-box support for dealing with connector metadata, we might want to expose created/last-updated times for connector configurations. We definitely don't have to do that in this KIP but if you agree that this would be useful, we should probably keep it in mind to avoid painting ourselves into a corner. This is why I'm thinking of using "tags" instead of something a little more generic like "metadata", BTW. [1] - https://cwiki.apache.org/confluence/display/KAFKA/KIP-908%3A+Add+description+field+to+connector+configuration Cheers, Chris On Thu, Feb 23, 2023 at 4:52 AM Mickael Maison <mickael.mai...@gmail.com> wrote: > Hi, > > I created a very small KIP to add a description field to connectors: > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-908%3A+Add+description+field+to+connector+configuration > > Let me know if you have any feedback. > > Thanks, > Mickael >