Hi Martijn,

Makes sense to me. As for dropping a connector, I think we need a separate discussion for each of them, and I would not block this effort on those discussions.
Cheers,

Konstantin

On Fri, Jan 14, 2022 at 10:26 AM Martijn Visser <mart...@ververica.com> wrote:

> Hi User mailing list,
>
> I'm also forwarding this thread to you. Please let me know if you have any
> comments or feedback!
>
> Best regards,
>
> Martijn
>
> ---------- Forwarded message ---------
> From: Martijn Visser <mart...@ververica.com>
> Date: Fri, 14 Jan 2022 at 06:28
> Subject: Re: [DISCUSS] Moving connectors from Flink to external connector repositories
> To: Qingsheng Ren <renqs...@gmail.com>
> Cc: dev <d...@flink.apache.org>
>
> Hi everyone,
>
> If you have any more comments or questions, please let me know. Otherwise I
> will open a vote on this thread in the next couple of days.
>
> Best regards,
>
> Martijn
>
> On Thu, 6 Jan 2022 at 09:45, Qingsheng Ren <renqs...@gmail.com> wrote:
>
>> Thanks Martijn for driving this!
>>
>> I'm +1 for Martijn's proposal. It's important to avoid elevating some
>> connectors above others; all connectors should share the same quality
>> standard. Keeping some basic connectors like FileSystem is reasonable,
>> since they are crucial for new users to try and explore Flink quickly.
>>
>> Another point I'd like to mention is that we need to add more E2E cases
>> using basic connectors in the Flink main repo after we move the connectors
>> out. Currently the E2E tests are heavily dependent on connectors. It's
>> essential to keep the coverage and quality of the Flink main repo even
>> without these connectors' E2E cases.
>>
>> Best regards,
>>
>> Qingsheng Ren
>>
>>
>> > On Jan 5, 2022, at 9:59 PM, Martijn Visser <mart...@ververica.com> wrote:
>> >
>> > Hi everyone,
>> >
>> > As already mentioned in the previous discussion thread [1], I'm opening
>> > up a parallel discussion thread on moving connectors from Flink to
>> > external connector repositories. If you haven't read up on this
>> > discussion before, I recommend reading that one first.
>> >
>> > The goal of the external connector repositories is to make it easier to
>> > develop and release connectors by not being bound to the release cycle
>> > of Flink itself. It should result in faster connector releases, a more
>> > active connector community and a reduced build time for Flink.
>> >
>> > We currently have the following connectors available in Flink itself:
>> >
>> > * Kafka -> For DataStream & Table/SQL users
>> > * Upsert-Kafka -> For Table/SQL users
>> > * Cassandra -> For DataStream users
>> > * Elasticsearch -> For DataStream & Table/SQL users
>> > * Kinesis -> For DataStream & Table/SQL users
>> > * RabbitMQ -> For DataStream users
>> > * Google Cloud PubSub -> For DataStream users
>> > * Hybrid Source -> For DataStream users
>> > * NiFi -> For DataStream users
>> > * Pulsar -> For DataStream users
>> > * Twitter -> For DataStream users
>> > * JDBC -> For DataStream & Table/SQL users
>> > * FileSystem -> For DataStream & Table/SQL users
>> > * HBase -> For DataStream & Table/SQL users
>> > * DataGen -> For Table/SQL users
>> > * Print -> For Table/SQL users
>> > * BlackHole -> For Table/SQL users
>> > * Hive -> For Table/SQL users
>> >
>> > I propose to move out all connectors except Hybrid Source, FileSystem,
>> > DataGen, Print and BlackHole, because:
>> >
>> > * We should avoid at all costs that certain connectors come to be seen
>> > as 'Core' connectors. If that happens, it creates a perception that
>> > there are first-grade/high-quality connectors because they are in 'Core'
>> > Flink and second-grade/lesser-quality connectors because they live
>> > outside the Flink codebase. It directly hurts the goal, because those
>> > connectors would still be bound to the release cycle of Flink. Last but
>> > not least, it risks the success of the external connector repositories,
>> > since every connector contributor would still want to be in 'Core' Flink.
>> > * To continue on the quality of connectors: we should aim for all
>> > connectors to be of high quality. That means a connector shouldn't be
>> > available only for either DataStream or Table/SQL users, but for both.
>> > It also means that (where applicable) the connector should support all
>> > options, like bounded and unbounded scan, lookup, and batch and
>> > streaming sink capabilities. In the end, the quality should depend on
>> > the maintainers of the connector, not on where the code is maintained.
>> > * The Hybrid Source connector is a special connector because of its
>> > purpose.
>> > * The FileSystem, DataGen, Print and BlackHole connectors are important
>> > for first-time Flink users/testers. If you want to experiment with
>> > Flink, you will most likely start with a local file before moving on to
>> > one of the other sources or sinks. These four connectors help with
>> > either reading/writing local files or generating/displaying/ignoring
>> > data.
>> > * Some of the connectors haven't been maintained in a long time (for
>> > example, NiFi and Google Cloud PubSub). An argument could be made that
>> > we should check whether we actually want to move such a connector or
>> > decide to drop the connector entirely.
>> >
>> > I'm looking forward to your thoughts!
>> >
>> > Best regards,
>> >
>> > Martijn Visser | Product Manager
>> >
>> > mart...@ververica.com
>> >
>> > [1] https://lists.apache.org/thread/bywh947r2f5hfocxq598zhyh06zhksrm
>> >
>> > <https://www.ververica.com/>
>> >
>> > Follow us @VervericaData
>> >
>> > --
>> >
>> > Join Flink Forward <https://flink-forward.org/> - The Apache Flink
>> > Conference
>> >
>> > Stream Processing | Event Driven | Real Time

--

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk