Hi community,

We have been working on a Multi Cluster Kafka Source and are looking to contribute it upstream. I gave a talk about its features and design at a Flink meetup: https://youtu.be/H1SYOuLcUTI.

The main features it provides are:

1. Reading from multiple Kafka clusters within a single source.
2. Dynamically adjusting the clusters and topics the source consumes from, without a Flink job restart.

Some of the challenging use cases that these features solve are:

1. Transparent Kafka cluster migration without a Flink job restart.
2. Transparent Kafka topic migration without a Flink job restart.
3. Direct integration with Hybrid Source.

In addition, the source is designed to wrap and manage the existing KafkaSource components to enable these features, so it continues to benefit from KafkaSource improvements and bug fixes. It can be considered a form of composite source.

I think contributing this source could benefit the many users who have asked on the mailing list in the past about how Flink can handle Kafka migrations and topic removal.

I would love to hear and address your thoughts and feedback and, if possible, drive a FLIP!

Best,
Mason