[ https://issues.apache.org/jira/browse/KAFKA-8177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Whalen resolved KAFKA-8177.
--------------------------------
    Resolution: Resolved

> Allow for separate connect instances to have sink connectors with the same
> name
> -------------------------------------------------------------------------------
>
>                 Key: KAFKA-8177
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8177
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>            Reporter: Paul Whalen
>            Priority: Minor
>              Labels: connect
>
> If you have multiple Connect instances (each either a single standalone
> worker or a distributed group of workers) running against the same Kafka
> cluster, the Connect instances cannot each have a sink connector with the
> same name and still operate independently. This is because the consumer
> group ID used internally for reading from the source topic(s) is entirely
> derived from the connector's name:
> [https://github.com/apache/kafka/blob/d0e436c471ba4122ddcc0f7a1624546f97c4a517/connect/runtime/src/main/java/org/apache/kafka/connect/util/SinkUtils.java#L24]
> The documentation of Connect implies to me that it supports "multi-tenancy,"
> that is, as long as...
> * In standalone mode, the {{offset.storage.file.filename}} is not shared
> between instances
> * In distributed mode, {{group.id}}, {{config.storage.topic}},
> {{offset.storage.topic}}, and {{status.storage.topic}} are not the same
> between instances
> ... then the Connect instances can operate completely independently without
> fear of conflict. But the sink connector consumer group naming policy makes
> this untrue. Independence can of course be achieved by uniquely naming
> connectors across instances, but in some environments that could be a bit of
> a nuisance, or a challenging policy to enforce. For instance, imagine a large
> group of developers or data analysts all running their own standalone Connect
> to load into a SQL database for their own analysis, or mirroring to their own
> local cluster for testing.
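
To make the collision concrete, here is a minimal sketch. The `consumerGroupId` method is a stand-in for the linked `SinkUtils` helper and assumes the group ID is simply the prefix "connect-" followed by the connector name. The instance names and the connector name are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class GroupIdCollision {
    // Stand-in for the linked SinkUtils helper (assumption: "connect-" + connector name).
    public static String consumerGroupId(String connectorName) {
        return "connect-" + connectorName;
    }

    public static void main(String[] args) {
        // Two hypothetical, otherwise unrelated Connect instances that each
        // run a sink connector named "jdbc-sink" against the same Kafka cluster.
        Map<String, String> sinkNameByInstance = new LinkedHashMap<>();
        sinkNameByInstance.put("alice-standalone", "jdbc-sink");
        sinkNameByInstance.put("bob-standalone", "jdbc-sink");

        for (Map.Entry<String, String> e : sinkNameByInstance.entrySet()) {
            // Nothing about the instance enters the group ID, so both lines
            // print the same consumer group.
            System.out.println(e.getKey() + " -> " + consumerGroupId(e.getValue()));
        }
    }
}
```

Because both sinks end up in one consumer group, Kafka would split the topic partitions between them rather than deliver every record to each instance.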
> The obvious solution is to allow supplying config that gives a Connect
> instance some notion of identity, and to use that when creating the sink task
> consumer group. Distributed mode already has this ({{group.id}}), but it
> would need to be added for standalone mode. Maybe {{instance.id}}? Given that
> solution it seems like this would need a small KIP.
> I could also imagine solving this problem through better documentation
> ("ensure your connector names are unique!"), but having that subtlety doesn't
> seem worth it to me. (Optionally) assigning identity to every Connect
> instance seems strictly clearer, without any downside.
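
A rough sketch of how the proposed identity could feed into the group ID, under stated assumptions: `instance.id` is the hypothetical standalone setting floated above, not an existing Connect config, and the `connect-<identity>-<name>` format is invented here for illustration. Only `group.id` is an existing worker setting.

```java
import java.util.Properties;

public class InstanceScopedGroupId {
    // Hypothetical standalone-mode setting suggested in the issue; not a real Connect config.
    static final String INSTANCE_ID_CONFIG = "instance.id";

    // Illustrative naming scheme only: fall back to the current
    // "connect-" + name form when no identity is configured.
    public static String consumerGroupId(Properties workerProps, String connectorName) {
        // Distributed workers already carry group.id; standalone would use instance.id.
        String identity = workerProps.getProperty(
                "group.id", workerProps.getProperty(INSTANCE_ID_CONFIG));
        if (identity == null || identity.isEmpty()) {
            return "connect-" + connectorName;
        }
        return "connect-" + identity + "-" + connectorName;
    }

    public static void main(String[] args) {
        Properties alice = new Properties();
        alice.setProperty(INSTANCE_ID_CONFIG, "alice");
        Properties bob = new Properties();
        bob.setProperty(INSTANCE_ID_CONFIG, "bob");

        // Same connector name, different identities, different consumer groups.
        System.out.println(consumerGroupId(alice, "jdbc-sink"));
        System.out.println(consumerGroupId(bob, "jdbc-sink"));
        // No identity configured: the existing behaviour is preserved.
        System.out.println(consumerGroupId(new Properties(), "jdbc-sink"));
    }
}
```

Keeping the identity optional means existing deployments would keep their current group IDs and committed offsets unless they opt in.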