+1 for deprecating SourceFunction from me as well. And a big THANK YOU to Alex Fedulov for bringing forward FLIP-238.
David

On Fri, Jun 10, 2022 at 4:03 AM Lijie Wang <wangdachui9...@gmail.com> wrote:

> Hi all,
>
> Sorry for my mistake. `StreamExecutionEnvironment#readFile` can be
> easily replaced by `FileSource#forRecordStreamFormat/forBulkFileFormat`.
> I have no other concerns.
>
> +1 to deprecate SourceFunction and to deprecate the methods (in
> StreamExecutionEnvironment) based on SourceFunction.
>
> Best,
> Lijie
>
> On Fri, Jun 10, 2022 at 05:11, Konstantin Knauf <kna...@apache.org> wrote:
>
> > Hi everyone,
> >
> > thank you Jing for redirecting the discussion back to the topic at
> > hand. I agree with all of your points.
> >
> > +1 to deprecate SourceFunction
> >
> > Is there really no replacement for the
> > StreamExecutionEnvironment#readXXX methods? There is already a
> > FLIP-27 based FileSource, right? What's missing to recommend using
> > that as opposed to the readXXX methods?
> >
> > Cheers,
> >
> > Konstantin
> >
> > On Thu, Jun 9, 2022 at 20:11, Alexander Fedulov
> > <alexan...@ververica.com> wrote:
> >
> > > Hi all,
> > >
> > > It seems that there is some understandable cautiousness with regard
> > > to deprecating methods and subclasses that do not have alternatives
> > > just yet.
> > >
> > > We should probably first agree on whether it is in general OK for
> > > Flink to use the @Deprecated annotation for parts of the code that
> > > do not have alternatives. In that case, we could add a comment
> > > along the lines of:
> > > "This implementation is based on a deprecated SourceFunction API
> > > that will gradually be phased out from Flink. No direct substitute
> > > exists at the moment. If you want to have a more future-proof
> > > solution, consider helping the project by contributing an
> > > implementation based on the new Source API."
> > >
> > > This should clearly communicate the message that usage of these
> > > methods/classes is discouraged and at the same time promote
> > > contributions for addressing the gap.
> > > What do you think?
> > >
> > > Best,
> > > Alexander Fedulov
> > >
> > > On Thu, Jun 9, 2022 at 6:27 PM Ingo Bürk <airbla...@apache.org> wrote:
> > >
> > > > Hi,
> > > >
> > > > these APIs don't expose the underlying source directly, so I
> > > > don't think we need to worry about deprecating them as well.
> > > > There's also nothing inherently wrong with using a deprecated API
> > > > internally, though even just for the experience of using our own
> > > > new APIs I would personally say that they should be migrated to
> > > > the new Source API. It's hard to reason that users must migrate
> > > > to a new API if we don't do it internally as well.
> > > >
> > > > Best
> > > > Ingo
> > > >
> > > > On 09.06.22 15:41, Lijie Wang wrote:
> > > > > Hi Martijn,
> > > > >
> > > > > I don't mean it's a blocker. It's just a piece of information.
> > > > > And I'm also +1 for this.
> > > > >
> > > > > Put another way: should we migrate `#readFile(...)` to the new
> > > > > API or provide a similar method "readXXX" based on the new
> > > > > Source API?
> > > > >
> > > > > And if we don't migrate it, does it mean that `#readFile(...)`
> > > > > should also be marked as deprecated?
> > > > >
> > > > > Best,
> > > > > Lijie
> > > > >
> > > > > On Thu, Jun 9, 2022 at 21:03, Martijn Visser
> > > > > <martijnvis...@apache.org> wrote:
> > > > >
> > > > >> Hi Lijie,
> > > > >>
> > > > >> I don't see any problem with deprecating those methods at this
> > > > >> moment, as long as we don't remove them until the replacements
> > > > >> are available. Besides that, are we sure there are no
> > > > >> replacements already, especially with the new FileSource?
> > > > >>
> > > > >> Best regards,
> > > > >>
> > > > >> Martijn
> > > > >>
> > > > >> On Thu, Jun 9, 2022 at 14:23, Lijie Wang
> > > > >> <wangdachui9...@gmail.com> wrote:
> > > > >>
> > > > >>> Hi all,
> > > > >>>
> > > > >>> FYI, currently, some commonly used methods in
> > > > >>> StreamExecutionEnvironment are still based on the old
> > > > >>> SourceFunction (and there is no alternative):
> > > > >>> `StreamExecutionEnvironment#readFile(...)`
> > > > >>> `StreamExecutionEnvironment#readTextFile(...)`
> > > > >>>
> > > > >>> I think these should be migrated to the new source API before
> > > > >>> deprecating SourceFunction.
> > > > >>>
> > > > >>> Best,
> > > > >>> Lijie
> > > > >>>
> > > > >>> On Thu, Jun 9, 2022 at 16:05, Martijn Visser
> > > > >>> <martijnvis...@apache.org> wrote:
> > > > >>>
> > > > >>>> Hi all,
> > > > >>>>
> > > > >>>> I think implicitly we've already considered the
> > > > >>>> SourceFunction and SinkFunction as deprecated. They are even
> > > > >>>> marked as such on the Flink roadmap [1]. That also shows that
> > > > >>>> connectors that are using these interfaces are approaching
> > > > >>>> end-of-life. The fact that we're actively migrating
> > > > >>>> connectors from Source/SinkFunction to FLIP-27/FLIP-143 (plus
> > > > >>>> add-on FLIPs) shows that we've already determined that
> > > > >>>> target.
> > > > >>>>
> > > > >>>> With regards to the motivation of FLIP-27, I think reading up
> > > > >>>> on the original discussion thread is also worthwhile [2] to
> > > > >>>> see more context. FLIP-27 was also very important as it
> > > > >>>> brought a unified connector which can support both streaming
> > > > >>>> and batch (with batch being considered a special case of
> > > > >>>> streaming in Flink's vision).
> > > > >>>>
> > > > >>>> So +1 to deprecate SourceFunction. I would also argue that we
> > > > >>>> should already mark the SinkFunction as deprecated to avoid
> > > > >>>> having this discussion again in a couple of months.
> > > > >>>>
> > > > >>>> Best regards,
> > > > >>>>
> > > > >>>> Martijn
> > > > >>>>
> > > > >>>> [1] https://flink.apache.org/roadmap.html
> > > > >>>> [2] https://lists.apache.org/thread/334co89dbhc8qpr9nvmz8t1gp4sz2c8y
> > > > >>>>
> > > > >>>> On Thu, Jun 9, 2022 at 09:48, Jing Ge <j...@ververica.com> wrote:
> > > > >>>>
> > > > >>>>> Hi,
> > > > >>>>>
> > > > >>>>> I am very happy to see opinions from different perspectives.
> > > > >>>>> That will help us understand the problem better. Thanks all
> > > > >>>>> for the informative discussion.
> > > > >>>>>
> > > > >>>>> Let's see the big picture and check the following facts
> > > > >>>>> together:
> > > > >>>>>
> > > > >>>>> 1. FLIP-27 was intended to solve some technical issues that
> > > > >>>>> are very difficult to solve with SourceFunction[1]. When we
> > > > >>>>> say "SourceFunction is easy", well, it depends. If we take a
> > > > >>>>> look at the implementation of the Kafka connector, we will
> > > > >>>>> know how complicated it is to build a serious connector for
> > > > >>>>> production with the old SourceFunction. To every problem
> > > > >>>>> there is a solution and to every solution there is a
> > > > >>>>> problem. The fact is that there is no perfect solution, only
> > > > >>>>> a feasible one. If we try to solve complicated problems, we
> > > > >>>>> have to expose some complexity. Compared to connectors for
> > > > >>>>> POCs, demos, and training (no offense), I would give higher
> > > > >>>>> priority to solving issues for connectors like the Kafka
> > > > >>>>> connector that are widely used in production. I think that
> > > > >>>>> should be one reason why FLIP-27 has been designed and why
> > > > >>>>> the new source API went public.
> > > > >>>>>
> > > > >>>>> 2. FLIP-27 and its implementation were introduced roughly at
> > > > >>>>> the end of 2019 and went public on 19.04.2021, which means
> > > > >>>>> Flink has provided two different public/graduated source
> > > > >>>>> solutions for more than one year. On the day that the new
> > > > >>>>> source API went public, there should have been a consensus
> > > > >>>>> in the community that we should start the migration. The old
> > > > >>>>> SourceFunction interface, in the ideal case, should have
> > > > >>>>> been deprecated on that day; otherwise we should not have
> > > > >>>>> graduated the new source API, to avoid confusing (connector)
> > > > >>>>> developers[2].
> > > > >>>>>
> > > > >>>>> 3. It is true that the new source API is hard to understand
> > > > >>>>> and even hard to implement for simple cases. Thanks for the
> > > > >>>>> feedback. That is something we need to improve. The current
> > > > >>>>> design and implementation could be considered the low-level
> > > > >>>>> API. The next step is to create the high-level API to reduce
> > > > >>>>> some unnecessary complexity for those simple cases. But,
> > > > >>>>> IMHO, this should not be a prerequisite for deprecating the
> > > > >>>>> old SourceFunction APIs.
> > > > >>>>>
> > > > >>>>> 4. As long as the old SourceFunction is not marked as
> > > > >>>>> deprecated, developers will continue asking which one should
> > > > >>>>> be used. Let's take a concrete example. If a new connector
> > > > >>>>> is developed now and the developer asks on the ML for a
> > > > >>>>> suggestion on the choice between the old and new source API,
> > > > >>>>> which one should we suggest? I think it should be the new
> > > > >>>>> Source API.
> > > > >>>>> Suppose a fresh new connector has been developed with the
> > > > >>>>> old SourceFunction API before asking for consensus in the
> > > > >>>>> community, and the developer wants to merge it to master:
> > > > >>>>> should we allow it? If the answers to all these questions
> > > > >>>>> point to the new Source API, the old SourceFunction is de
> > > > >>>>> facto already deprecated; it just has not been marked as
> > > > >>>>> @Deprecated, which confuses developers even more.
> > > > >>>>>
> > > > >>>>> Best regards,
> > > > >>>>> Jing
> > > > >>>>>
> > > > >>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
> > > > >>>>> [2] https://lists.apache.org/thread/7okp4y46n3o3rx5mn0t3qobrof8zxwqs
> > > > >>>>>
> > > > >>>>> On Wed, Jun 8, 2022 at 2:21 AM Alexander Fedulov
> > > > >>>>> <alexan...@ververica.com> wrote:
> > > > >>>>>
> > > > >>>>>> Hey Austin,
> > > > >>>>>>
> > > > >>>>>> Since we are getting deeper into the implementation details
> > > > >>>>>> of the DataGeneratorSource and it is not the main topic of
> > > > >>>>>> this thread, I propose to move our discussion to where it
> > > > >>>>>> belongs: [DISCUSS] FLIP-238 [1]. Could you please briefly
> > > > >>>>>> formulate your requirements to make it easier for the
> > > > >>>>>> others to follow? I am happy to continue this conversation
> > > > >>>>>> there.
> > > > >>>>>>
> > > > >>>>>> [1] https://lists.apache.org/thread/7gjxto1rmkpff4kl54j8nlg5db2rqhkt
> > > > >>>>>>
> > > > >>>>>> Best,
> > > > >>>>>> Alexander Fedulov
> > > > >>>>>>
> > > > >>>>>> On Tue, Jun 7, 2022 at 6:14 PM Austin Cawley-Edwards
> > > > >>>>>> <austin.caw...@gmail.com> wrote:
> > > > >>>>>>
> > > > >>>>>>>> @Austin, in the FLIP I mentioned above [1], the user is
> > > > >>>>>>>> expected to pass a MapFunction<Long, OUT> to the
> > > > >>>>>>>> generator. I wonder if you could have your external
> > > > >>>>>>>> client and polling logic wrapped in a custom MapFunction
> > > > >>>>>>>> implementation class? Would that answer your needs or do
> > > > >>>>>>>> you have some more sophisticated scenario in mind?
> > > > >>>>>>>
> > > > >>>>>>> At first glance, the FLIP looks good for this case with
> > > > >>>>>>> regard to the map function, but it leaves out 1) the
> > > > >>>>>>> ability to control polling intervals and 2) the ability to
> > > > >>>>>>> produce an unknown number of records, both per poll and
> > > > >>>>>>> overall (boundedness). Do you think something like this
> > > > >>>>>>> could be built from the same pieces?
> > > > >>>>>>> I'm also wondering what handles threading: is that on the
> > > > >>>>>>> user or is that part of the DataGeneratorSource?
> > > > >>>>>>>
> > > > >>>>>>> Best,
> > > > >>>>>>> Austin
> > > > >>>>>>>
> > > > >>>>>>> On Tue, Jun 7, 2022 at 9:34 AM Alexander Fedulov
> > > > >>>>>>> <alexan...@ververica.com> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Hi everyone,
> > > > >>>>>>>>
> > > > >>>>>>>> Thanks for all the input and a lively discussion.
> > > > >>>>>>>> It seems that there is a consensus that due to the
> > > > >>>>>>>> inherent complexity of FLIP-27 sources we should provide
> > > > >>>>>>>> more user-facing utilities to bridge the gap between the
> > > > >>>>>>>> existing SourceFunction-based functionality and the new
> > > > >>>>>>>> APIs.
> > > > >>>>>>>>
> > > > >>>>>>>> To start addressing this I picked the issue that David
> > > > >>>>>>>> raised and many upvoted. Here is a proposal for the new
> > > > >>>>>>>> DataGeneratorSource: FLIP-238 [1]. Please take a look, I
> > > > >>>>>>>> am going to open a separate discussion thread on it
> > > > >>>>>>>> shortly.
> > > > >>>>>>>>
> > > > >>>>>>>> Jing also raised some great points regarding the
> > > > >>>>>>>> interfaces and subclasses. It seems to me that what might
> > > > >>>>>>>> actually help is some sort of a "soft deprecation"
> > > > >>>>>>>> concept and annotation. It could be used in places where
> > > > >>>>>>>> we do not have an alternative implementation yet, but we
> > > > >>>>>>>> clearly want to indicate that continuing to build on top
> > > > >>>>>>>> of these interfaces is discouraged. The area of impact of
> > > > >>>>>>>> deprecating all SourceFunction subclasses is rather big,
> > > > >>>>>>>> and we can expect it to take a while.
> > > > >>>>>>>> The hope would be that if in the meantime someone finds
> > > > >>>>>>>> themselves using one of these old APIs, the "soft
> > > > >>>>>>>> deprecation" annotation will be a clear indication and
> > > > >>>>>>>> encouragement to work on introducing an alternative
> > > > >>>>>>>> FLIP-27-based implementation instead.
> > > > >>>>>>>>
> > > > >>>>>>>> @Austin, in the FLIP I mentioned above [1], the user is
> > > > >>>>>>>> expected to pass a MapFunction<Long, OUT> to the
> > > > >>>>>>>> generator. I wonder if you could have your external
> > > > >>>>>>>> client and polling logic wrapped in a custom MapFunction
> > > > >>>>>>>> implementation class? Would that answer your needs or do
> > > > >>>>>>>> you have some more sophisticated scenario in mind?
> > > > >>>>>>>>
> > > > >>>>>>>> [1] https://cwiki.apache.org/confluence/x/9Av1D
> > > > >>>>>>>>
> > > > >>>>>>>> Best,
> > > > >>>>>>>> Alexander Fedulov
> > > > >>>>>>>>
> > > > >>>>>>>> On Mon, Jun 6, 2022 at 7:08 PM Austin Cawley-Edwards
> > > > >>>>>>>> <austin.caw...@gmail.com> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> Thanks for the nice discussion all.
> > > > >>>>>>>>>
> > > > >>>>>>>>> I was recently trying to implement a very simple polling
> > > > >>>>>>>>> source and would've loved a higher-level base to work
> > > > >>>>>>>>> from. I'm wondering if, in addition to the data
> > > > >>>>>>>>> generator use cases, it would be good to support a
> > > > >>>>>>>>> simple non-parallel polling abstraction to make it
> > > > >>>>>>>>> easier to, for instance, start prototyping with data in
> > > > >>>>>>>>> existing APIs without adding a Kafka or such in the
> > > > >>>>>>>>> middle.
> > > > >>>>>>>>>
> > > > >>>>>>>>> Best,
> > > > >>>>>>>>> Austin
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Mon, Jun 6, 2022 at 10:02 AM tison
> > > > >>>>>>>>> <wander4...@gmail.com> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> Well. It's a bit off-topic. For deprecating
> > > > >>>>>>>>>> SourceFunction as the FLIP-27 series of work goes
> > > > >>>>>>>>>> ahead, +1 from my side. It's a significant step towards
> > > > >>>>>>>>>> the unification of batch and streaming :)
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Best,
> > > > >>>>>>>>>> tison.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> On Mon, Jun 6, 2022 at 21:54, tison
> > > > >>>>>>>>>> <wander4...@gmail.com> wrote:
> > > > >>>>>>>>>>
> > > > >>>>>>>>>>> The starting point of the version bump and removal
> > > > >>>>>>>>>>> question is that downstream projects may have a tough
> > > > >>>>>>>>>>> time adapting to new interfaces while Flink stays in
> > > > >>>>>>>>>>> the 1.x version series, where users may expect
> > > > >>>>>>>>>>> upgrades to be an easy task. From my experience, it's
> > > > >>>>>>>>>>> really challenging to maintain compatibility between
> > > > >>>>>>>>>>> multiple versions of Flink when significant changes
> > > > >>>>>>>>>>> are made within the same 1.x series - users may not be
> > > > >>>>>>>>>>> aware that it's almost a major version bump.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> Best,
> > > > >>>>>>>>>>> tison.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> On Mon, Jun 6, 2022 at 21:51, tison
> > > > >>>>>>>>>>> <wander4...@gmail.com> wrote:
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>>> One question from my side:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> As SourceFunction is a @Public interface, we cannot
> > > > >>>>>>>>>>>> remove it before doing a major version bump (Flink
> > > > >>>>>>>>>>>> 2.0).
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Of course it's not a blocker to make such a
> > > > >>>>>>>>>>>> deprecation and let the new interface step in. My
> > > > >>>>>>>>>>>> question is whether we have a plan to finally remove
> > > > >>>>>>>>>>>> the deprecated interfaces, or whether we postpone it
> > > > >>>>>>>>>>>> until there is a clear plan for Flink 2.0?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Best,
> > > > >>>>>>>>>>>> tison.
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> On Mon, Jun 6, 2022 at 21:35, David Anderson
> > > > >>>>>>>>>>>> <dander...@apache.org> wrote:
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> David, can you elaborate why you need watermark
> > > > >>>>>>>>>>>>>> generation in the source for your data generators?
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> The training exercises should strive to provide
> > > > >>>>>>>>>>>>> examples of best practices. If the exercises and
> > > > >>>>>>>>>>>>> their solutions use
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> env.fromSource(source, WatermarkStrategy.noWatermarks(), "name-of-source")
> > > > >>>>>>>>>>>>>     .map(...)
> > > > >>>>>>>>>>>>>     .assignTimestampsAndWatermarks(...)
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> this will help establish this anti-pattern as the
> > > > >>>>>>>>>>>>> normal way of doing things.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Most new Flink users are using a KafkaSource with a
> > > > >>>>>>>>>>>>> noWatermarks strategy and a SimpleStringSchema,
> > > > >>>>>>>>>>>>> followed by a map that does the real
> > > > >>>>>>>>>>>>> deserialization, followed by the real watermarking
> > > > >>>>>>>>>>>>> -- because they aren't seeing examples that teach
> > > > >>>>>>>>>>>>> how these interfaces are meant to be used.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> When we redo the sources used in training exercises,
> > > > >>>>>>>>>>>>> I want to avoid these pitfalls.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> David
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> On Mon, Jun 6, 2022 at 9:12 AM Konstantin Knauf
> > > > >>>>>>>>>>>>> <kna...@apache.org> wrote:
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Hi everyone,
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> very interesting thread. The proposal for
> > > > >>>>>>>>>>>>>> deprecation seems to have sparked a very important
> > > > >>>>>>>>>>>>>> discussion. Do we know what users struggle with
> > > > >>>>>>>>>>>>>> specifically?
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Speaking for myself, when I upgraded flink-faker to
> > > > >>>>>>>>>>>>>> the new Source API, an unbounded version of the
> > > > >>>>>>>>>>>>>> NumberSequenceSource would have been all I needed,
> > > > >>>>>>>>>>>>>> but that's just the data generator use case. I
> > > > >>>>>>>>>>>>>> think that one could be solved quite easily. David,
> > > > >>>>>>>>>>>>>> can you elaborate why you need watermark generation
> > > > >>>>>>>>>>>>>> in the source for your data generators?
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Cheers,
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> Konstantin
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> On Sun, Jun 5, 2022 at 17:48, Piotr Nowojski
> > > > >>>>>>>>>>>>>> <pnowoj...@apache.org> wrote:
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Also +1 to what David has written. But it doesn't
> > > > >>>>>>>>>>>>>>> mean we should be waiting indefinitely to
> > > > >>>>>>>>>>>>>>> deprecate SourceFunction.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Best,
> > > > >>>>>>>>>>>>>>> Piotrek
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> On Sun, Jun 5, 2022 at 16:46, Jark Wu
> > > > >>>>>>>>>>>>>>> <imj...@gmail.com> wrote:
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> +1 to David's point.
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Usually, when we deprecate some interfaces, we
> > > > >>>>>>>>>>>>>>>> should point users to the recommended
> > > > >>>>>>>>>>>>>>>> alternatives.
> > > > >>>>>>>>>>>>>>>> However, implementing the new Source interface
> > > > >>>>>>>>>>>>>>>> for some simple scenarios is too challenging and
> > > > >>>>>>>>>>>>>>>> complex.
> > > > >>>>>>>>>>>>>>>> We also found it isn't easy to push the internal
> > > > >>>>>>>>>>>>>>>> connector to upgrade to the new Source because
> > > > >>>>>>>>>>>>>>>> "FLIP-27 are hard to understand, while
> > > > >>>>>>>>>>>>>>>> SourceFunction is easy".
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> +1 to make implementing a simple Source easier
> > > > >>>>>>>>>>>>>>>> before deprecating SourceFunction.
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> Best,
> > > > >>>>>>>>>>>>>>>> Jark
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>> On Sun, 5 Jun 2022 at 07:29, Jingsong Lee
> > > > >>>>>>>>>>>>>>>> <lzljs3620...@apache.org> wrote:
> > > > >>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>> +1 to David and Ingo.
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>> Before we deprecate and remove SourceFunction,
> > > > >>>>>>>>>>>>>>>>> we should have some easier APIs that wrap the
> > > > >>>>>>>>>>>>>>>>> new Source; the cost of writing a new Source is
> > > > >>>>>>>>>>>>>>>>> too high now.
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>> On Sun, Jun 5, 2022 at 05:32, Ingo Bürk
> > > > >>>>>>>>>>>>>>>>> <airbla...@apache.org> wrote:
> > > > >>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> I +1 everything David said. The new Source API
> > > > >>>>>>>>>>>>>>>>>> raised the complexity significantly.
> > > > >>>>>>>>>>>>>>>>>> It's great to have such a rich, powerful API
> > > > >>>>>>>>>>>>>>>>>> that can do everything, but in the process we
> > > > >>>>>>>>>>>>>>>>>> lost the ability to onboard people to the
> > > > >>>>>>>>>>>>>>>>>> APIs.
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> Best
> > > > >>>>>>>>>>>>>>>>>> Ingo
> > > > >>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>> On 04.06.22 21:21, David Anderson wrote:
> > > > >>>>>>>>>>>>>>>>>>> I'm in favor of this, but I think we need to
> > > > >>>>>>>>>>>>>>>>>>> make it easier to implement data generators
> > > > >>>>>>>>>>>>>>>>>>> and test sources. As things stand in 1.15,
> > > > >>>>>>>>>>>>>>>>>>> unless you can be satisfied with using a
> > > > >>>>>>>>>>>>>>>>>>> NumberSequenceSource followed by a map,
> > > > >>>>>>>>>>>>>>>>>>> things get quite complicated. I looked into
> > > > >>>>>>>>>>>>>>>>>>> reworking the data generators used in the
> > > > >>>>>>>>>>>>>>>>>>> training exercises, and got discouraged by
> > > > >>>>>>>>>>>>>>>>>>> the amount of work involved.
> > > > >>>>>>>>>>>>>>>>>>> (The sources used in the training want to be
> > > > >>>>>>>>>>>>>>>>>>> unbounded, and need watermarking in the
> > > > >>>>>>>>>>>>>>>>>>> sources, which means that using
> > > > >>>>>>>>>>>>>>>>>>> NumberSequenceSource isn't an option.)
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> I think the proposed deprecation will be
> > > > >>>>>>>>>>>>>>>>>>> better received if it can be accompanied by
> > > > >>>>>>>>>>>>>>>>>>> something that makes implementing a simple
> > > > >>>>>>>>>>>>>>>>>>> Source easier than it is now. People are
> > > > >>>>>>>>>>>>>>>>>>> continuing to implement new SourceFunctions
> > > > >>>>>>>>>>>>>>>>>>> because the interfaces defined by FLIP-27 are
> > > > >>>>>>>>>>>>>>>>>>> hard to understand, while SourceFunction is
> > > > >>>>>>>>>>>>>>>>>>> easy. Alex, I believe you were looking into
> > > > >>>>>>>>>>>>>>>>>>> implementing an easier-to-use building block
> > > > >>>>>>>>>>>>>>>>>>> that could be used in situations like this.
> > > > >>>>>>>>>>>>>>>>>>> Can we get something like that in place
> > > > >>>>>>>>>>>>>>>>>>> first?
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> David
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>> On Fri, Jun 3, 2022 at 4:52 PM Jing Ge
> > > > >>>>>>>>>>>>>>>>>>> <j...@ververica.com> wrote:
> > > > >>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>> Hi,
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>> Thanks Alex for driving this!
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>> +1 To give Flink developers, especially
> > > > >>>>>>>>>>>>>>>>>>>> connector developers, the clear signal that
> > > > >>>>>>>>>>>>>>>>>>>> the new Source API is recommended according
> > > > >>>>>>>>>>>>>>>>>>>> to FLIP-27, we should mark the
> > > > >>>>>>>>>>>>>>>>>>>> SourceFunction-based interfaces as
> > > > >>>>>>>>>>>>>>>>>>>> deprecated.
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>> There are some open questions to discuss:
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>> 1. Do we need to mark all
> > > > >>>>>>>>>>>>>>>>>>>>    subinterfaces/subclasses as deprecated,
> > > > >>>>>>>>>>>>>>>>>>>>    e.g. FromElementsFunction, etc.? There
> > > > >>>>>>>>>>>>>>>>>>>>    are many. What are the replacements?
> > > > >>>>>>>>>>>>>>>>>>>> 2. Do we need to mark all subclasses that
> > > > >>>>>>>>>>>>>>>>>>>>    have a replacement as deprecated? E.g.
> > > > >>>>>>>>>>>>>>>>>>>>    ExternallyInducedSource, whose
> > > > >>>>>>>>>>>>>>>>>>>>    replacement class, if I am not mistaken,
> > > > >>>>>>>>>>>>>>>>>>>>    ExternallyInducedSourceReader, is
> > > > >>>>>>>>>>>>>>>>>>>>    @Experimental.
> > > > >>>>>>>>>>>>>>>>>>>> 3. Do we need to mark all related test
> > > > >>>>>>>>>>>>>>>>>>>>    utility classes as deprecated?
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>> I think it might make sense to create an
> > > > >>>>>>>>>>>>>>>>>>>> umbrella ticket to cover all of these with
> > > > >>>>>>>>>>>>>>>>>>>> the following process:
> > > > >>>>>>>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>>>>>>> 1. Mark SourceFunction as deprecated asap.
> > > > >>>>>>>>>>>>>>>>>>>> 2. Mark subinterfaces and subclasses as
> > > > >>>>>>>>>>>>>>>>>>>>    deprecated if there are graduated
> > > > >>>>>>>>>>>>>>>>>>>>    replacements. A good example is that
> > > > >>>>>>>>>>>>>>>>>>>>    KafkaSource replaced KafkaConsumer, which
> > > > >>>>>>>>>>>>>>>>>>>>    has been marked as deprecated.
> > > > >>>>>>>>>>>>>>>>>>>> 3. Do not mark subinterfaces and subclasses
> > > > >>>>>>>>>>>>>>>>>>>>    as deprecated if replacement classes are
> > > > >>>>>>>>>>>>>>>>>>>>    still experimental; check if it is time
> > > > >>>>>>>>>>>>>>>>>>>>    to graduate them. After graduation, go to
> > > > >>>>>>>>>>>>>>>>>>>>    step 2. It might take a while for
> > > > >>>>>>>>>>>>>>>>>>>>    graduation.
> > > > >>>>>>>>>>>>>>>>>>>> 4. Do not mark subinterfaces and subclasses
> > > > >>>>>>>>>>>>>>>>>>>>    as deprecated if the replacement classes
> > > > >>>>>>>>>>>>>>>>>>>>    are experimental and are too young to
> > > > >>>>>>>>>>>>>>>>>>>>    graduate.
> > > > >>>>>>>>>>>>>>>> We > > > > >>>>>>>>>>>>>>>>>> have > > > > >>>>>>>>>>>>>>>>>>>> to wait. But in this case we could create > > > > >>> new > > > > >>>>>>> tickets > > > > >>>>>>>>>> under > > > > >>>>>>>>>>>>> the > > > > >>>>>>>>>>>>>>>>> umbrella > > > > >>>>>>>>>>>>>>>>>>>> ticket. > > > > >>>>>>>>>>>>>>>>>>>> 5. Do not mark subinterfaces and > > > > >> subclasses > > > > >>> as > > > > >>>>>>>>> deprecated, > > > > >>>>>>>>>>>>> if > > > > >>>>>>>>>>>>>>> there > > > > >>>>>>>>>>>>>>>> is > > > > >>>>>>>>>>>>>>>>>> no > > > > >>>>>>>>>>>>>>>>>>>> replacement at all. We have to create new > > > > >>>>> tickets > > > > >>>>>>> and > > > > >>>>>>>>> wait > > > > >>>>>>>>>>>>> until > > > > >>>>>>>>>>>>>>> the > > > > >>>>>>>>>>>>>>>>> new > > > > >>>>>>>>>>>>>>>>>>>> implementation has been done and > > > > >> graduated. > > > > >>> It > > > > >>>>>> will > > > > >>>>>>>>> take a > > > > >>>>>>>>>>>>>> longer > > > > >>>>>>>>>>>>>>>>> time, > > > > >>>>>>>>>>>>>>>>>>>> roughly 1,5 years. > > > > >>>>>>>>>>>>>>>>>>>> 6. For test classes, we could follow the > > > > >>> same > > > > >>>>>> rule. > > > > >>>>>>>> But > > > > >>>>>>>>> I > > > > >>>>>>>>>>>>> think > > > > >>>>>>>>>>>>>>> for > > > > >>>>>>>>>>>>>>>>> some > > > > >>>>>>>>>>>>>>>>>>>> cases, we could consider doing the > > > > >>> replacement > > > > >>>>>>>> directly > > > > >>>>>>>>>>>>> without > > > > >>>>>>>>>>>>>>>> going > > > > >>>>>>>>>>>>>>>>>>>> through the deprecation phase. > > > > >>>>>>>>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>>>>>>>> When we look back on all of these, we can > > > > >>>>> realize > > > > >>>>>> it > > > > >>>>>>>> is > > > > >>>>>>>>> a > > > > >>>>>>>>>>>>> big > > > > >>>>>>>>>>>>>> epic > > > > >>>>>>>>>>>>>>>>> (even > > > > >>>>>>>>>>>>>>>>>>>> bigger than an epic). 
It needs someone to > > > > >>>> drive > > > > >>>>> it > > > > >>>>>>> and > > > > >>>>>>>>>> keep > > > > >>>>>>>>>>>>>> focus > > > > >>>>>>>>>>>>>>> on > > > > >>>>>>>>>>>>>>>>> it > > > > >>>>>>>>>>>>>>>>>>>> continuously with support from the > > > > >> community > > > > >>>> and > > > > >>>>>>> push > > > > >>>>>>>>> the > > > > >>>>>>>>>>>>>>>> development > > > > >>>>>>>>>>>>>>>>>>>> towards the new Source API of FLIP-27. > > > > >>>>>>>>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>>>>>>>> If we could have consensus for this, Alex > > > > >>>> and I > > > > >>>>>>> could > > > > >>>>>>>>>>>>> create > > > > >>>>>>>>>>>>>> the > > > > >>>>>>>>>>>>>>>>>> umbrella > > > > >>>>>>>>>>>>>>>>>>>> ticket to kick it off. > > > > >>>>>>>>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>>>>>>>> Best regards, > > > > >>>>>>>>>>>>>>>>>>>> Jing > > > > >>>>>>>>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>>>>>>>> On Fri, Jun 3, 2022 at 3:54 PM Alexander > > > > >>>>> Fedulov < > > > > >>>>>>>>>>>>>>>>>> alexan...@ververica.com> > > > > >>>>>>>>>>>>>>>>>>>> wrote: > > > > >>>>>>>>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>>>>>>>>> Hi everyone, > > > > >>>>>>>>>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>>>>>>>>> I would like to start the discussion > > > > >> about > > > > >>>>>> marking > > > > >>>>>>>>>>>>>>>>> SourceFunction-based > > > > >>>>>>>>>>>>>>>>>>>>> interfaces as deprecated. With the > > > > >> FLIP-27 > > > > >>>> APIs > > > > >>>>>>>>> becoming > > > > >>>>>>>>>>>>> the > > > > >>>>>>>>>>>>>> new > > > > >>>>>>>>>>>>>>>>>>>> standard, > > > > >>>>>>>>>>>>>>>>>>>>> the old ones have to be eventually phased > > > > >>>> out. 
> > > > >>>>>>>> Although > > > > >>>>>>>>>>>>> this > > > > >>>>>>>>>>>>>>> state > > > > >>>>>>>>>>>>>>>> is > > > > >>>>>>>>>>>>>>>>>>>> well > > > > >>>>>>>>>>>>>>>>>>>>> known within the community and no new > > > > >>>>> connectors > > > > >>>>>>>> based > > > > >>>>>>>>> on > > > > >>>>>>>>>>>>> the > > > > >>>>>>>>>>>>>> old > > > > >>>>>>>>>>>>>>>>>>>>> interfaces can be accepted into the > > > > >>> project, > > > > >>>>> the > > > > >>>>>>>>>> footprint > > > > >>>>>>>>>>>>> of > > > > >>>>>>>>>>>>>>>>>>>>> SourceFunction in the user code still > > > > >> keeps > > > > >>>>>> growing > > > > >>>>>>>>>>>>> (primarily > > > > >>>>>>>>>>>>>>> for > > > > >>>>>>>>>>>>>>>>> data > > > > >>>>>>>>>>>>>>>>>>>>> generators and test utilities). I believe > > > > >>> it > > > > >>>> is > > > > >>>>>>> best > > > > >>>>>>>> to > > > > >>>>>>>>>>>>> mark > > > > >>>>>>>>>>>>>>>>>>>> SourceFunction > > > > >>>>>>>>>>>>>>>>>>>>> as deprecated as soon as possible. What > > > > >> do > > > > >>>> you > > > > >>>>>>> think? > > > > >>>>>>>>>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>>>>>>>>> Best, > > > > >>>>>>>>>>>>>>>>>>>>> Alexander Fedulov > > > > >>>>>>>>>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>> > > > > >>>>>>>>>>>>>> -- > > > > >>>>>>>>>>>>>> https://twitter.com/snntrable > > > > >>>>>>>>>>>>>> https://github.com/knaufk > > > > >>>>>>>>>>>>>> > > > > >>>>>>>>>>>>> > > > > >>>>>>>>>>>> > > > > >>>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>> > > > > >>>>>>> > > > > >>>>>> > > > > >>>>> > > > > >>>> > > > > >>> > > > > >> > > > > > > > > > > > > > > > > > > -- > > https://twitter.com/snntrable > > https://github.com/knaufk > > >
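
Steps 2 to 5 of the process Jing proposes in the quoted message amount to a lookup on the state of the replacement class. The following is only a minimal Java sketch of that reading. The class `DeprecationPolicy`, its enums, and the method `decide` are hypothetical names made up for illustration; they are not part of Flink and nothing in the thread proposes them.

```java
/** Hypothetical illustration of steps 2-5 from the thread; not a Flink API. */
public final class DeprecationPolicy {

    /** Assumed states of the FLIP-27-based replacement for a SourceFunction subclass. */
    public enum ReplacementStatus {
        GRADUATED,                      // replacement exists and is no longer experimental
        EXPERIMENTAL_READY_TO_GRADUATE, // replacement is experimental but mature
        EXPERIMENTAL_TOO_YOUNG,         // replacement is experimental and too new
        NONE                            // no replacement exists yet
    }

    /** What to do with the old SourceFunction-based class. */
    public enum Action {
        DEPRECATE_NOW,               // step 2
        GRADUATE_REPLACEMENT_FIRST,  // step 3, then revisit as step 2
        WAIT_AND_TRACK_IN_TICKET,    // step 4
        IMPLEMENT_REPLACEMENT_FIRST  // step 5
    }

    private DeprecationPolicy() {}

    /** Maps the replacement's status to the action described in the thread. */
    public static Action decide(ReplacementStatus status) {
        switch (status) {
            case GRADUATED:
                return Action.DEPRECATE_NOW;
            case EXPERIMENTAL_READY_TO_GRADUATE:
                return Action.GRADUATE_REPLACEMENT_FIRST;
            case EXPERIMENTAL_TOO_YOUNG:
                return Action.WAIT_AND_TRACK_IN_TICKET;
            case NONE:
                return Action.IMPLEMENT_REPLACEMENT_FIRST;
            default:
                throw new IllegalArgumentException("Unknown status: " + status);
        }
    }

    public static void main(String[] args) {
        // Print the action for every replacement status.
        for (ReplacementStatus status : ReplacementStatus.values()) {
            System.out.println(status + " -> " + decide(status));
        }
    }
}
```

Step 1 (deprecate SourceFunction itself right away) and step 6 (test classes may be replaced directly) fall outside this lookup, since the first is unconditional and the second is left to case-by-case judgment in the proposal.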