+1 to deprecate. +1 for David's points. I have one related question: do we have any plan to offer a lighter Source API to decrease the connector development cost?
The new Source API is good, but too heavy for use cases like tests or even some simple connectors.

Best,
Leonard

> On Jun 6, 2022, at 9:51 PM, tison <wander4...@gmail.com> wrote:
>
> One question from my side:
>
> As SourceFunction is a @Public interface, we cannot remove it before doing a
> major version bump (Flink 2.0).
>
> Of course it's not a blocker to make such a deprecation and let the new
> interface step in. My question is whether we have a plan to finally remove
> the deprecated interfaces, or whether we postpone it until there is a clear
> plan for Flink 2.0?
>
> Best,
> tison.
>
>
> David Anderson <dander...@apache.org> wrote on Mon, Jun 6, 2022 at 21:35:
>
>>>
>>> David, can you elaborate why you need watermark generation in the source
>>> for your data generators?
>>
>>
>> The training exercises should strive to provide examples of best practices.
>> If the exercises and their solutions use
>>
>> env.fromSource(source, WatermarkStrategy.noWatermarks(), "name-of-source")
>>     .map(...)
>>     .assignTimestampsAndWatermarks(...)
>>
>> this will help establish this anti-pattern as the normal way of doing
>> things.
>>
>> Most new Flink users are using a KafkaSource with a noWatermarks strategy
>> and a SimpleStringSchema, followed by a map that does the real
>> deserialization, followed by the real watermarking -- because they aren't
>> seeing examples that teach how these interfaces are meant to be used.
>>
>> When we redo the sources used in training exercises, I want to avoid these
>> pitfalls.
>>
>> David
>>
>> On Mon, Jun 6, 2022 at 9:12 AM Konstantin Knauf <kna...@apache.org> wrote:
>>
>>> Hi everyone,
>>>
>>> very interesting thread. The proposal for deprecation seems to have sparked
>>> a very important discussion. Do we know what users struggle with specifically?
>>>
>>> Speaking for myself, when I upgraded flink-faker to the new Source API, an
>>> unbounded version of the NumberSequenceSource would have been all I needed,
>>> but that's just the data generator use case.
>>> I think that one could be solved quite easily. David, can you elaborate
>>> why you need watermark generation in the source for your data generators?
>>>
>>> Cheers,
>>>
>>> Konstantin
>>>
>>>
>>> On Sun, Jun 5, 2022 at 17:48, Piotr Nowojski <pnowoj...@apache.org> wrote:
>>>
>>>> Also +1 to what David has written. But it doesn't mean we should be waiting
>>>> indefinitely to deprecate SourceFunction.
>>>>
>>>> Best,
>>>> Piotrek
>>>>
>>>> On Sun, Jun 5, 2022 at 16:46, Jark Wu <imj...@gmail.com> wrote:
>>>>
>>>>> +1 to David's point.
>>>>>
>>>>> Usually, when we deprecate some interfaces, we should point users to
>>>>> the recommended alternatives.
>>>>> However, implementing the new Source interface for some simple scenarios
>>>>> is too challenging and complex.
>>>>> We also found it isn't easy to push the internal connectors to upgrade to
>>>>> the new Source because
>>>>> "FLIP-27 are hard to understand, while SourceFunction is easy".
>>>>>
>>>>> +1 to make implementing a simple Source easier before deprecating
>>>>> SourceFunction.
>>>>>
>>>>> Best,
>>>>> Jark
>>>>>
>>>>>
>>>>> On Sun, 5 Jun 2022 at 07:29, Jingsong Lee <lzljs3620...@apache.org> wrote:
>>>>>
>>>>>> +1 to David and Ingo.
>>>>>>
>>>>>> Before we deprecate and remove SourceFunction, we should have some easier
>>>>>> APIs to wrap the new Source; the cost of writing a new Source is too high now.
>>>>>>
>>>>>>
>>>>>> Ingo Bürk <airbla...@apache.org> wrote on Sun, Jun 5, 2022 at 05:32:
>>>>>>
>>>>>>> I +1 everything David said. The new Source API raised the complexity
>>>>>>> significantly. It's great to have such a rich, powerful API that can do
>>>>>>> everything, but in the process we lost the ability to onboard people to
>>>>>>> the APIs.
>>>>>>>
>>>>>>>
>>>>>>> Best
>>>>>>> Ingo
>>>>>>>
>>>>>>> On 04.06.22 21:21, David Anderson wrote:
>>>>>>>> I'm in favor of this, but I think we need to make it easier to implement
>>>>>>>> data generators and test sources. As things stand in 1.15, unless you can
>>>>>>>> be satisfied with using a NumberSequenceSource followed by a map, things
>>>>>>>> get quite complicated. I looked into reworking the data generators used in
>>>>>>>> the training exercises, and got discouraged by the amount of work involved.
>>>>>>>> (The sources used in the training want to be unbounded, and need
>>>>>>>> watermarking in the sources, which means that using NumberSequenceSource
>>>>>>>> isn't an option.)
>>>>>>>>
>>>>>>>> I think the proposed deprecation will be better received if it can be
>>>>>>>> accompanied by something that makes implementing a simple Source easier
>>>>>>>> than it is now. People are continuing to implement new SourceFunctions
>>>>>>>> because the interfaces defined by FLIP-27 are hard to understand, while
>>>>>>>> SourceFunction is easy. Alex, I believe you were looking into implementing
>>>>>>>> an easier-to-use building block that could be used in situations like this.
>>>>>>>> Can we get something like that in place first?
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>> On Fri, Jun 3, 2022 at 4:52 PM Jing Ge <j...@ververica.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Thanks Alex for driving this!
>>>>>>>>>
>>>>>>>>> +1. To give Flink developers, especially connector developers, a clear
>>>>>>>>> signal that the new Source API is recommended according to FLIP-27, we
>>>>>>>>> should mark them as deprecated.
>>>>>>>>>
>>>>>>>>> There are some open questions to discuss:
>>>>>>>>>
>>>>>>>>> 1. Do we need to mark all subinterfaces/subclasses as deprecated? e.g.
>>>>>>>>> FromElementsFunction, etc. There are many. What are the replacements?
>>>>>>>>> 2. Do we need to mark all subclasses that have a replacement as deprecated?
>>>>>>>>> e.g. ExternallyInducedSource, whose replacement class, if I am not mistaken,
>>>>>>>>> ExternallyInducedSourceReader, is @Experimental.
>>>>>>>>> 3. Do we need to mark all related test utility classes as deprecated?
>>>>>>>>>
>>>>>>>>> I think it might make sense to create an umbrella ticket to cover all of
>>>>>>>>> these with the following process:
>>>>>>>>>
>>>>>>>>> 1. Mark SourceFunction as deprecated asap.
>>>>>>>>> 2. Mark subinterfaces and subclasses as deprecated if there are graduated
>>>>>>>>> replacements. A good example is KafkaSource, which replaced KafkaConsumer,
>>>>>>>>> which has been marked as deprecated.
>>>>>>>>> 3. Do not mark subinterfaces and subclasses as deprecated if the replacement
>>>>>>>>> classes are still experimental; check whether it is time to graduate them.
>>>>>>>>> After graduation, go to step 2. Graduation might take a while.
>>>>>>>>> 4. Do not mark subinterfaces and subclasses as deprecated if the
>>>>>>>>> replacement classes are experimental and are too young to graduate. We have
>>>>>>>>> to wait. But in this case we could create new tickets under the umbrella
>>>>>>>>> ticket.
>>>>>>>>> 5. Do not mark subinterfaces and subclasses as deprecated if there is no
>>>>>>>>> replacement at all. We have to create new tickets and wait until the new
>>>>>>>>> implementation has been done and graduated. It will take a longer time,
>>>>>>>>> roughly 1.5 years.
>>>>>>>>> 6. For test classes, we could follow the same rule. But I think for some
>>>>>>>>> cases, we could consider doing the replacement directly without going
>>>>>>>>> through the deprecation phase.
>>>>>>>>>
>>>>>>>>> Looking back at all of these, we can see that it is a big epic (even
>>>>>>>>> bigger than an epic). It needs someone to drive it and keep a continuous
>>>>>>>>> focus on it, with support from the community, and to push development
>>>>>>>>> towards the new Source API of FLIP-27.
>>>>>>>>>
>>>>>>>>> If we can reach consensus on this, Alex and I could create the umbrella
>>>>>>>>> ticket to kick it off.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Jing
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Jun 3, 2022 at 3:54 PM Alexander Fedulov <alexan...@ververica.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi everyone,
>>>>>>>>>>
>>>>>>>>>> I would like to start the discussion about marking SourceFunction-based
>>>>>>>>>> interfaces as deprecated. With the FLIP-27 APIs becoming the new standard,
>>>>>>>>>> the old ones have to be eventually phased out. Although this state is well
>>>>>>>>>> known within the community and no new connectors based on the old
>>>>>>>>>> interfaces can be accepted into the project, the footprint of
>>>>>>>>>> SourceFunction in the user code still keeps growing (primarily for data
>>>>>>>>>> generators and test utilities). I believe it is best to mark SourceFunction
>>>>>>>>>> as deprecated as soon as possible. What do you think?
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Alexander Fedulov
>>>
>>> --
>>> https://twitter.com/snntrable
>>> https://github.com/knaufk
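
For reference, steps 2 to 5 of the process Jing outlines in the quoted message can be read as a small decision rule keyed on the state of the replacement class. The following is only a rough sketch in plain Java to make that rule explicit. The class, enum, and method names are hypothetical and do not exist in Flink, and the mapping is one reading of the list, not an agreed policy.

```java
// Hypothetical sketch: none of these names are part of Flink.
final class DeprecationPlan {

    /** State of the FLIP-27 replacement for a SourceFunction subinterface or subclass. */
    enum ReplacementStatus {
        GRADUATED,                      // replacement exists and is no longer experimental
        EXPERIMENTAL_READY_TO_GRADUATE, // replacement is experimental but could graduate
        EXPERIMENTAL_TOO_YOUNG,         // replacement is experimental and not mature yet
        NONE                            // no replacement exists at all
    }

    /** What to do with the old class, following steps 2 to 5 of the quoted list. */
    enum Action {
        MARK_DEPRECATED,             // step 2
        GRADUATE_REPLACEMENT_FIRST,  // step 3, then re-evaluate as step 2
        WAIT_AND_TRACK_TICKET,       // step 4
        IMPLEMENT_REPLACEMENT_FIRST  // step 5
    }

    static Action decide(ReplacementStatus status) {
        switch (status) {
            case GRADUATED:
                return Action.MARK_DEPRECATED;
            case EXPERIMENTAL_READY_TO_GRADUATE:
                return Action.GRADUATE_REPLACEMENT_FIRST;
            case EXPERIMENTAL_TOO_YOUNG:
                return Action.WAIT_AND_TRACK_TICKET;
            case NONE:
                return Action.IMPLEMENT_REPLACEMENT_FIRST;
            default:
                throw new IllegalArgumentException("Unknown status: " + status);
        }
    }

    public static void main(String[] args) {
        // Print the action chosen for every replacement state.
        for (ReplacementStatus status : ReplacementStatus.values()) {
            System.out.println(status + " -> " + decide(status));
        }
    }
}
```

Only a class whose replacement has graduated is deprecated right away; every other state results in a ticket under the umbrella ticket rather than an immediate deprecation.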