Hi Enrico,

Thank you for working on this!

But as I mentioned in the pull request, we should avoid using a
one-connector-per-schema model. That model probably works with other
connectors that have a very limited number of schemas. If you are going to
implement a schema-aware Kafka connector, that model is impossible to
maintain, because it will introduce N * N connectors where N is the number
of supported schemas.

We should maintain one "bytes" connector and convert the Kafka schema to
the Pulsar schema. I wrote an enhanced Kafka connector
<https://github.com/streamnative/pulsar-io-kafka> two years ago.

You just need to maintain one connector:
https://github.com/streamnative/pulsar-io-kafka/blob/master/src/main/java/io/streamnative/connectors/kafka/KafkaSource.java#L94
Then convert the Kafka SerDe to a Pulsar schema:
https://github.com/streamnative/pulsar-io-kafka/blob/master/src/main/java/io/streamnative/connectors/kafka/KafkaSource.java#L338
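
To illustrate the idea, here is a minimal sketch and not the code from the
linked repository. It assumes a single source that looks at the configured
Kafka deserializer class name and picks a matching Pulsar schema type,
falling back to raw bytes. The class name SerdeToSchemaSketch and the method
pulsarSchemaFor are hypothetical, and the schema types are plain strings
(named after Pulsar SchemaType values) so that the sketch compiles without
Kafka or Pulsar on the classpath.

```java
import java.util.Map;

// Hypothetical sketch: one connector chooses the Pulsar schema from the
// Kafka deserializer in its config, instead of one connector per schema.
public class SerdeToSchemaSketch {

    // Keys are Kafka's built-in deserializer class names; values are labels
    // named after Pulsar SchemaType constants (kept as strings on purpose).
    private static final Map<String, String> SCHEMA_BY_DESERIALIZER = Map.of(
        "org.apache.kafka.common.serialization.StringDeserializer", "STRING",
        "org.apache.kafka.common.serialization.LongDeserializer", "INT64",
        "org.apache.kafka.common.serialization.IntegerDeserializer", "INT32",
        "org.apache.kafka.common.serialization.DoubleDeserializer", "DOUBLE",
        "org.apache.kafka.common.serialization.ByteArrayDeserializer", "BYTES");

    // Returns the schema label for a deserializer class name; anything
    // unknown is passed through as raw bytes.
    public static String pulsarSchemaFor(String kafkaDeserializerClass) {
        return SCHEMA_BY_DESERIALIZER.getOrDefault(kafkaDeserializerClass, "BYTES");
    }

    public static void main(String[] args) {
        String configured = "org.apache.kafka.common.serialization.StringDeserializer";
        System.out.println(configured + " -> " + pulsarSchemaFor(configured));
    }
}
```

With a lookup like this, supporting a new schema means adding one mapping
entry rather than a new connector class.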

I am happy to submit a PR to merge those changes back.

- Sijie

On Thu, Feb 11, 2021 at 11:48 PM Enrico Olivelli <eolive...@gmail.com>
wrote:

> Hello everyone,
> here in our Pulsar repository we have a simple Kafka Connector for Pulsar
> IO composed of a Sink and a Source.
> https://github.com/apache/pulsar/tree/master/pulsar-io/kafka
>
> I have started to work on a set of enhancements to this connector in order
> to make it more powerful and to better fit the needs of enterprise users.
>
> The first patch I have submitted is about supporting Avro encoded messages
> + Confluent Schema Registry in the KafkaSource
> https://github.com/apache/pulsar/pull/9448
>
> The patch is only the first one of a bigger work that we have to do in
> order to have a fully usable Connector for non-trivial use cases.
>
> I will be happy to follow up with other patches and especially to draw a
> little roadmap about the features that we want to implement and provide to
> the community.
>
> Please take a look at the patch and share your thoughts.
>
> Regards
> Enrico
>
