Hi User mailing list,

I'm also forwarding this thread to you. Please let me know if you have any comments or feedback!
Best regards,

Martijn

---------- Forwarded message ---------
From: Martijn Visser <mart...@ververica.com>
Date: Fri, 14 Jan 2022 at 06:28
Subject: Re: [DISCUSS] Moving connectors from Flink to external connector repositories
To: Qingsheng Ren <renqs...@gmail.com>
Cc: dev <d...@flink.apache.org>

Hi everyone,

If you have any more comments or questions, please let me know. Otherwise I will open a vote on this thread in the next couple of days.

Best regards,

Martijn

On Thu, 6 Jan 2022 at 09:45, Qingsheng Ren <renqs...@gmail.com> wrote:

> Thanks Martijn for driving this!
>
> I'm +1 for Martijn's proposal. It's important to avoid placing some
> connectors above others; all connectors should share the same quality
> standard. Keeping some basic connectors like FileSystem is reasonable,
> since it's crucial for new users to try and explore Flink quickly.
>
> Another point I'd like to mention: we need to add more E2E cases that
> use the basic connectors in the Flink main repo after we move the
> connectors out. Currently, E2E tests depend heavily on connectors, and
> it's essential to keep the coverage and quality of the Flink main repo
> even without these connectors' E2E cases.
>
> Best regards,
>
> Qingsheng Ren
>
> > On Jan 5, 2022, at 9:59 PM, Martijn Visser <mart...@ververica.com> wrote:
> >
> > Hi everyone,
> >
> > As already mentioned in the previous discussion thread [1], I'm opening
> > up a parallel discussion thread on moving connectors from Flink to
> > external connector repositories. If you haven't read up on that
> > discussion before, I recommend reading it first.
> >
> > The goal of the external connector repositories is to make it easier to
> > develop and release connectors by not being bound to the release cycle
> > of Flink itself. It should result in faster connector releases, a more
> > active connector community and a reduced build time for Flink.
> >
> > We currently have the following connectors available in Flink itself:
> >
> > * Kafka -> For DataStream & Table/SQL users
> > * Upsert-Kafka -> For Table/SQL users
> > * Cassandra -> For DataStream users
> > * Elasticsearch -> For DataStream & Table/SQL users
> > * Kinesis -> For DataStream & Table/SQL users
> > * RabbitMQ -> For DataStream users
> > * Google Cloud PubSub -> For DataStream users
> > * Hybrid Source -> For DataStream users
> > * NiFi -> For DataStream users
> > * Pulsar -> For DataStream users
> > * Twitter -> For DataStream users
> > * JDBC -> For DataStream & Table/SQL users
> > * FileSystem -> For DataStream & Table/SQL users
> > * HBase -> For DataStream & Table/SQL users
> > * DataGen -> For Table/SQL users
> > * Print -> For Table/SQL users
> > * BlackHole -> For Table/SQL users
> > * Hive -> For Table/SQL users
> >
> > I propose to move out all connectors except Hybrid Source, FileSystem,
> > DataGen, Print and BlackHole, because:
> >
> > * We should avoid at all costs that certain connectors are considered
> > 'Core' connectors. If that happens, it creates the perception that
> > there are first-grade/high-quality connectors because they are in
> > 'Core' Flink and second-grade/lesser-quality connectors because they
> > live outside the Flink codebase. It directly hurts the goal, because
> > these connectors would still be bound to the release cycle of Flink.
> > Last but not least, it risks the success of the external connector
> > repositories, since every connector contributor would still want to
> > be in 'Core' Flink.
> > * To continue on the quality of connectors: we should aim for all
> > connectors to be of high quality. That means we shouldn't have a
> > connector that's only available for either DataStream or Table/SQL
> > users, but for both. It also means that (if applicable) the connector
> > should support all options, like bounded and unbounded scan, lookup,
> > and batch and streaming sink capabilities. In the end, the quality
> > should depend on the maintainers of the connector, not on where the
> > code is maintained.
> > * The Hybrid Source connector is a special connector because of its
> > purpose.
> > * The FileSystem, DataGen, Print and BlackHole connectors are
> > important for first-time Flink users/testers. If you want to
> > experiment with Flink, you will most likely start with a local file
> > before moving to one of the other sources or sinks. These four
> > connectors can help with either reading/writing local files or
> > generating/displaying/ignoring data.
> > * Some of the connectors haven't been maintained in a long time (for
> > example, NiFi and Google Cloud PubSub). An argument could be made
> > that we should check whether we actually want to move such a
> > connector or decide to drop it entirely.
> >
> > I'm looking forward to your thoughts!
> >
> > Best regards,
> >
> > Martijn Visser | Product Manager
> >
> > mart...@ververica.com
> >
> > [1] https://lists.apache.org/thread/bywh947r2f5hfocxq598zhyh06zhksrm
> >
> > <https://www.ververica.com/>
> >
> > Follow us @VervericaData
> >
> > --
> >
> > Join Flink Forward <https://flink-forward.org/> - The Apache Flink
> > Conference
> >
> > Stream Processing | Event Driven | Real Time
> >
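[Editor's note] The point above about DataGen, Print and BlackHole being useful for first-time users can be illustrated with a minimal Flink SQL sketch. The connector identifiers ('datagen', 'print') and the 'number-of-rows' option are standard Flink Table connector options; the table names and schema below are made up purely for illustration:

```sql
-- A self-contained pipeline that needs no external system:
-- 'datagen' produces synthetic rows, 'print' writes them to stdout.
CREATE TABLE orders (
  order_id BIGINT,
  amount   DOUBLE
) WITH (
  'connector' = 'datagen',
  'number-of-rows' = '10'   -- bounded source: stop after 10 rows
);

CREATE TABLE console (
  order_id BIGINT,
  amount   DOUBLE
) WITH (
  'connector' = 'print'     -- swap for 'blackhole' to discard output
);

INSERT INTO console SELECT order_id, amount FROM orders;
```

This is the kind of zero-dependency experiment that would break if these connectors also moved out of the main repository.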