Hi Enrico,

Thank you for working on this!
But as I mentioned in the pull request, we should avoid a one-connector-per-schema model. That model probably works for connectors that support a very limited number of schemas, but for a schema-aware Kafka connector it is impossible to maintain, because it would introduce N * N connectors, where N is the number of supported schemas. We should instead maintain one "bytes" connector and translate the Kafka schema into the Pulsar schema.

I wrote an enhanced Kafka connector two years ago:
https://github.com/streamnative/pulsar-io-kafka

You only need to maintain one connector:
https://github.com/streamnative/pulsar-io-kafka/blob/master/src/main/java/io/streamnative/connectors/kafka/KafkaSource.java#L94

and then convert the Kafka SerDe to a Pulsar schema:
https://github.com/streamnative/pulsar-io-kafka/blob/master/src/main/java/io/streamnative/connectors/kafka/KafkaSource.java#L338

I am happy to submit a PR to merge those changes back.

- Sijie

On Thu, Feb 11, 2021 at 11:48 PM Enrico Olivelli <eolive...@gmail.com> wrote:

> Hello everyone,
> here in our Pulsar repository we have a simple Kafka connector for Pulsar
> IO, composed of a Sink and a Source:
> https://github.com/apache/pulsar/tree/master/pulsar-io/kafka
>
> I have started to work on a set of enhancements to this connector in order
> to make it more powerful and to better fit the needs of enterprise users.
>
> The first patch I have submitted adds support for Avro-encoded messages
> and the Confluent Schema Registry in the KafkaSource:
> https://github.com/apache/pulsar/pull/9448
>
> This patch is only the first part of a larger effort needed to make the
> connector fully usable for non-trivial use cases.
>
> I will be happy to follow up with other patches, and especially to draw up
> a little roadmap of the features that we want to implement and provide to
> the community.
>
> Please take a look at the patch and share your thoughts.
>
> Regards
> Enrico
>
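(For illustration only: the "one bytes connector that translates the Kafka SerDe into a Pulsar schema" idea discussed above could be sketched roughly as below. The class and method names here are hypothetical and do not reflect the actual pulsar-io-kafka API; the real conversion lives in KafkaSource.java linked above.)

```java
import java.util.Map;

// Hypothetical sketch: map the Kafka deserializer configured on the
// connector (key.deserializer / value.deserializer) to the Pulsar schema
// type that a single "bytes" connector would produce messages with,
// instead of shipping one connector per schema.
public class KafkaSerDeToPulsarSchema {

    // Assumed mapping for a few well-known Kafka deserializers; the
    // schema-type names on the right are illustrative labels, not the
    // Pulsar API.
    private static final Map<String, String> SERDE_TO_SCHEMA = Map.of(
        "org.apache.kafka.common.serialization.StringDeserializer", "STRING",
        "org.apache.kafka.common.serialization.ByteArrayDeserializer", "BYTES",
        "io.confluent.kafka.serializers.KafkaAvroDeserializer", "AVRO"
    );

    public static String toPulsarSchemaType(String kafkaDeserializer) {
        // Fall back to raw bytes for unknown SerDes, so the one connector
        // can still carry any payload.
        return SERDE_TO_SCHEMA.getOrDefault(kafkaDeserializer, "BYTES");
    }

    public static void main(String[] args) {
        System.out.println(toPulsarSchemaType(
            "io.confluent.kafka.serializers.KafkaAvroDeserializer")); // AVRO
        System.out.println(toPulsarSchemaType(
            "com.example.CustomDeserializer")); // BYTES (fallback)
    }
}
```

With a lookup like this, adding support for a new schema means adding one table entry (or one branch in the conversion code), not an entire new connector.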