Hi Martijn,

Makes sense to me. As for dropping a connector, I think we need a separate discussion for each of them, and I would not block this effort on those discussions.
Cheers,

Konstantin

On Fri, Jan 14, 2022 at 10:26 AM Martijn Visser <mart...@ververica.com> wrote:

> Hi User mailing list,
>
> I'm also forwarding this thread to you. Please let me know if you have any
> comments or feedback!
>
> Best regards,
>
> Martijn
>
> ---------- Forwarded message ---------
> From: Martijn Visser <mart...@ververica.com>
> Date: Fri, 14 Jan 2022 at 06:28
> Subject: Re: [DISCUSS] Moving connectors from Flink to external connector repositories
> To: Qingsheng Ren <renqs...@gmail.com>
> Cc: dev <d...@flink.apache.org>
>
> Hi everyone,
>
> If you have any more comments or questions, please let me know. Otherwise I
> will open a vote on this thread in the next couple of days.
>
> Best regards,
>
> Martijn
>
> On Thu, 6 Jan 2022 at 09:45, Qingsheng Ren <renqs...@gmail.com> wrote:
>
>> Thanks Martijn for driving this!
>>
>> I'm +1 for Martijn's proposal. It's important to avoid elevating some
>> connectors above others; all connectors should share the same quality
>> standard. Keeping some basic connectors like FileSystem is reasonable,
>> since they are crucial for new users to try and explore Flink quickly.
>>
>> Another point I'd like to mention is that we need to add more E2E cases
>> using basic connectors in the Flink main repo after we move the connectors
>> out. Currently the E2E tests are heavily dependent on connectors. It's
>> essential to keep the coverage and quality of the Flink main repo even
>> without these connectors' E2E cases.
>>
>> Best regards,
>>
>> Qingsheng Ren
>>
>>
>> > On Jan 5, 2022, at 9:59 PM, Martijn Visser <mart...@ververica.com> wrote:
>> >
>> > Hi everyone,
>> >
>> > As already mentioned in the previous discussion thread [1], I'm opening
>> > up a parallel discussion thread on moving connectors from Flink to
>> > external connector repositories. If you haven't read up on this
>> > discussion before, I recommend reading that one first.
>> >
>> > The goal of the external connector repositories is to make it easier to
>> > develop and release connectors by not being bound to the release cycle
>> > of Flink itself. It should result in faster connector releases, a more
>> > active connector community and a reduced build time for Flink.
>> >
>> > We currently have the following connectors available in Flink itself:
>> >
>> > * Kafka -> For DataStream & Table/SQL users
>> > * Upsert-Kafka -> For Table/SQL users
>> > * Cassandra -> For DataStream users
>> > * Elasticsearch -> For DataStream & Table/SQL users
>> > * Kinesis -> For DataStream & Table/SQL users
>> > * RabbitMQ -> For DataStream users
>> > * Google Cloud PubSub -> For DataStream users
>> > * Hybrid Source -> For DataStream users
>> > * NiFi -> For DataStream users
>> > * Pulsar -> For DataStream users
>> > * Twitter -> For DataStream users
>> > * JDBC -> For DataStream & Table/SQL users
>> > * FileSystem -> For DataStream & Table/SQL users
>> > * HBase -> For DataStream & Table/SQL users
>> > * DataGen -> For Table/SQL users
>> > * Print -> For Table/SQL users
>> > * BlackHole -> For Table/SQL users
>> > * Hive -> For Table/SQL users
>> >
>> > I propose to move out all connectors except Hybrid Source, FileSystem,
>> > DataGen, Print and BlackHole, because:
>> >
>> > * We should avoid at all costs that certain connectors come to be seen
>> > as 'Core' connectors. If that happens, it creates a perception that
>> > there are first-grade/high-quality connectors because they are in 'Core'
>> > Flink and second-grade/lesser-quality connectors because they live
>> > outside the Flink codebase. It directly hurts the goal, because those
>> > connectors would still be bound to the release cycle of Flink. Last but
>> > not least, it risks the success of the external connector repositories,
>> > since every connector contributor would still want to be in 'Core' Flink.
>> > * To continue on the quality of connectors: we should aim for all
>> > connectors to be of high quality. That means a connector shouldn't be
>> > available only for either DataStream or Table/SQL users, but for both.
>> > It also means that (where applicable) the connector should support all
>> > options, like bounded and unbounded scan, lookup, and batch and
>> > streaming sink capabilities. In the end, the quality should depend on
>> > the maintainers of the connector, not on where the code is maintained.
>> > * The Hybrid Source connector is a special connector because of its
>> > purpose.
>> > * The FileSystem, DataGen, Print and BlackHole connectors are important
>> > for first-time Flink users/testers. If you want to experiment with
>> > Flink, you will most likely start with a local file before moving on to
>> > one of the other sources or sinks. These four connectors help with
>> > either reading/writing local files or generating/displaying/ignoring
>> > data.
>> > * Some of the connectors haven't been maintained in a long time (for
>> > example, NiFi and Google Cloud PubSub). An argument could be made that
>> > we should check whether we actually want to move such a connector or
>> > decide to drop the connector entirely.
>> >
>> > I'm looking forward to your thoughts!
>> >
>> > Best regards,
>> >
>> > Martijn Visser | Product Manager
>> >
>> > mart...@ververica.com
>> >
>> > [1] https://lists.apache.org/thread/bywh947r2f5hfocxq598zhyh06zhksrm
>> >
>> > <https://www.ververica.com/>
>> >
>> > Follow us @VervericaData
>> >
>> > --
>> >
>> > Join Flink Forward <https://flink-forward.org/> - The Apache Flink
>> > Conference
>> >
>> > Stream Processing | Event Driven | Real Time

--

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk