>
> David, can you elaborate why you need watermark generation in the source
> for your data generators?


The training exercises should strive to provide examples of best practices.
If the exercises and their solutions use

env.fromSource(source, WatermarkStrategy.noWatermarks(), "name-of-source")
  .map(...)
  .assignTimestampsAndWatermarks(...)

this will help establish this anti-pattern as the normal way of doing
things.

Most new Flink users are using a KafkaSource with a noWatermarks strategy
and a SimpleStringSchema, followed by a map that does the real
deserialization, followed by the real watermarking -- because they aren't
seeing examples that teach how these interfaces are meant to be used.
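
One commonly cited reason for passing the WatermarkStrategy to fromSource (it is not spelled out in this message) is that the source can then track watermarks per split or partition and combine them by minimum, so a lagging partition holds the watermark back. The following is a plain-Java sketch of that difference, not Flink code: the class and method names are hypothetical, and the watermark rule is a simplified "max timestamp minus bound".

```java
import java.util.List;

public class WatermarkPlacementSketch {

    /** Simplified bounded-out-of-orderness rule: watermark = max timestamp seen - bound. */
    static final class BoundedWatermark {
        private final long boundMillis;
        private long maxTimestamp = Long.MIN_VALUE;

        BoundedWatermark(long boundMillis) {
            this.boundMillis = boundMillis;
        }

        void onEvent(long timestamp) {
            maxTimestamp = Math.max(maxTimestamp, timestamp);
        }

        long current() {
            // No events seen yet: report the lowest possible watermark.
            return maxTimestamp == Long.MIN_VALUE ? Long.MIN_VALUE : maxTimestamp - boundMillis;
        }
    }

    /** Watermarking after the source: a single generator sees the merged stream. */
    public static long watermarkAfterSource(List<List<Long>> partitions, long boundMillis) {
        BoundedWatermark merged = new BoundedWatermark(boundMillis);
        for (List<Long> partition : partitions) {
            for (long timestamp : partition) {
                merged.onEvent(timestamp);
            }
        }
        return merged.current();
    }

    /** Watermarking in the source: one generator per partition, combined by minimum. */
    public static long watermarkInSource(List<List<Long>> partitions, long boundMillis) {
        long min = Long.MAX_VALUE;
        for (List<Long> partition : partitions) {
            BoundedWatermark perPartition = new BoundedWatermark(boundMillis);
            for (long timestamp : partition) {
                perPartition.onEvent(timestamp);
            }
            min = Math.min(min, perPartition.current());
        }
        return min;
    }

    public static void main(String[] args) {
        List<List<Long>> partitions = List.of(
                List.of(1_000L, 2_000L, 9_000L), // fast partition
                List.of(1_500L, 2_500L));        // lagging partition
        long bound = 1_000L;
        System.out.println("after source: " + watermarkAfterSource(partitions, bound));
        System.out.println("in source:    " + watermarkInSource(partitions, bound));
    }
}
```

With this sample data the merged generator has already advanced to 8000, so the lagging partition's next event (say, timestamp 3000) would count as late, while the per-partition variant stays at 1500.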

When we redo the sources used in training exercises, I want to avoid these
pitfalls.

David

On Mon, Jun 6, 2022 at 9:12 AM Konstantin Knauf <kna...@apache.org> wrote:

> Hi everyone,
>
> very interesting thread. The proposal for deprecation seems to have sparked
> a very important discussion. Do we know what users struggle with specifically?
>
> Speaking for myself, when I upgraded flink-faker to the new Source API, an
> unbounded version of the NumberSequenceSource would have been all I needed,
> but that's just the data generator use case. I think that one could be
> solved quite easily. David, can you elaborate why you need watermark
> generation in the source for your data generators?
>
> Cheers,
>
> Konstantin
>
>
>
>
>
> On Sun, Jun 5, 2022 at 5:48 PM Piotr Nowojski <
> pnowoj...@apache.org> wrote:
>
> > Also +1 to what David has written. But it doesn't mean we should be waiting
> > indefinitely to deprecate SourceFunction.
> >
> > Best,
> > Piotrek
> >
> > On Sun, Jun 5, 2022 at 4:46 PM Jark Wu <imj...@gmail.com> wrote:
> >
> > > +1 to David's point.
> > >
> > > Usually, when we deprecate some interfaces, we should point users to
> > > the recommended alternatives.
> > > However, implementing the new Source interface for some simple scenarios
> > > is too challenging and complex.
> > > We also found it isn't easy to push the internal connector to upgrade to
> > > the new Source because
> > > "FLIP-27 are hard to understand, while SourceFunction is easy".
> > >
> > > +1 to make implementing a simple Source easier before deprecating
> > > SourceFunction.
> > >
> > > Best,
> > > Jark
> > >
> > >
> > > On Sun, 5 Jun 2022 at 07:29, Jingsong Lee <lzljs3620...@apache.org>
> > > wrote:
> > >
> > > > +1 to David and Ingo.
> > > >
> > > > Before we deprecate and remove SourceFunction, we should have some
> > > > easier APIs to wrap the new Source; the cost of writing a new Source
> > > > is too high now.
> > > >
> > > >
> > > >
> > > > On Sun, Jun 5, 2022 at 5:32 AM Ingo Bürk <airbla...@apache.org> wrote:
> > > >
> > > > > I +1 everything David said. The new Source API raised the complexity
> > > > > significantly. It's great to have such a rich, powerful API that can do
> > > > > everything, but in the process we lost the ability to onboard people to
> > > > > the APIs.
> > > > >
> > > > >
> > > > > Best
> > > > > Ingo
> > > > >
> > > > > On 04.06.22 21:21, David Anderson wrote:
> > > > > > I'm in favor of this, but I think we need to make it easier to implement
> > > > > > data generators and test sources. As things stand in 1.15, unless you can
> > > > > > be satisfied with using a NumberSequenceSource followed by a map, things
> > > > > > get quite complicated. I looked into reworking the data generators used in
> > > > > > the training exercises, and got discouraged by the amount of work involved.
> > > > > > (The sources used in the training want to be unbounded, and need
> > > > > > watermarking in the sources, which means that using NumberSequenceSource
> > > > > > isn't an option.)
> > > > > >
> > > > > > I think the proposed deprecation will be better received if it can be
> > > > > > accompanied by something that makes implementing a simple Source easier
> > > > > > than it is now. People are continuing to implement new SourceFunctions
> > > > > > because the interfaces defined by FLIP-27 are hard to understand, while
> > > > > > SourceFunction is easy. Alex, I believe you were looking into implementing
> > > > > > an easier-to-use building block that could be used in situations like this.
> > > > > > Can we get something like that in place first?
> > > > > >
> > > > > > David
> > > > > >
> > > > > > On Fri, Jun 3, 2022 at 4:52 PM Jing Ge <j...@ververica.com> wrote:
> > > > > >
> > > > > >> Hi,
> > > > > >>
> > > > > >> Thanks Alex for driving this!
> > > > > >>
> > > > > > >> +1. To give Flink developers, especially connector developers, a clear
> > > > > > >> signal that the new Source API is recommended according to FLIP-27, we
> > > > > > >> should mark the SourceFunction-based interfaces as deprecated.
> > > > > >>
> > > > > >> There are some open questions to discuss:
> > > > > >>
> > > > > > >> 1. Do we need to mark all subinterfaces/subclasses as deprecated, e.g.
> > > > > > >> FromElementsFunction? There are many. What are the replacements?
> > > > > > >> 2. Do we need to mark all subclasses that have a replacement as deprecated,
> > > > > > >> e.g. ExternallyInducedSource, whose replacement class, if I am not
> > > > > > >> mistaken, is ExternallyInducedSourceReader, which is @Experimental?
> > > > > > >> 3. Do we need to mark all related test utility classes as deprecated?
> > > > > >>
> > > > > > >> I think it might make sense to create an umbrella ticket to cover all of
> > > > > > >> these with the following process:
> > > > > >>
> > > > > > >> 1. Mark SourceFunction as deprecated asap.
> > > > > > >> 2. Mark subinterfaces and subclasses as deprecated if there are graduated
> > > > > > >> replacements. A good example is KafkaSource, which replaced KafkaConsumer;
> > > > > > >> the latter has been marked as deprecated.
> > > > > > >> 3. Do not mark subinterfaces and subclasses as deprecated if the replacement
> > > > > > >> classes are still experimental; check whether it is time to graduate them.
> > > > > > >> After graduation, go to step 2. Graduation might take a while.
> > > > > > >> 4. Do not mark subinterfaces and subclasses as deprecated if the
> > > > > > >> replacement classes are experimental and are too young to graduate. We have
> > > > > > >> to wait. But in this case we could create new tickets under the umbrella
> > > > > > >> ticket.
> > > > > > >> 5. Do not mark subinterfaces and subclasses as deprecated if there is no
> > > > > > >> replacement at all. We have to create new tickets and wait until the new
> > > > > > >> implementation has been done and graduated. That will take longer,
> > > > > > >> roughly 1.5 years.
> > > > > > >> 6. For test classes, we could follow the same rule. But I think for some
> > > > > > >> cases, we could consider doing the replacement directly without going
> > > > > > >> through the deprecation phase.
> > > > > >>
> > > > > > >> Looking at all of this, we can see it is a big epic (even bigger than an
> > > > > > >> epic). It needs someone to drive it and keep a continuous focus on it,
> > > > > > >> with support from the community, and to push development towards the new
> > > > > > >> Source API of FLIP-27.
> > > > > >>
> > > > > > >> If we reach consensus on this, Alex and I could create the umbrella
> > > > > > >> ticket to kick it off.
> > > > > >>
> > > > > >> Best regards,
> > > > > >> Jing
> > > > > >>
> > > > > >>
> > > > > > >> On Fri, Jun 3, 2022 at 3:54 PM Alexander Fedulov <
> > > > > > >> alexan...@ververica.com> wrote:
> > > > > >>
> > > > > >>> Hi everyone,
> > > > > >>>
> > > > > > >>> I would like to start the discussion about marking SourceFunction-based
> > > > > > >>> interfaces as deprecated. With the FLIP-27 APIs becoming the new standard,
> > > > > > >>> the old ones have to be eventually phased out. Although this state is well
> > > > > > >>> known within the community and no new connectors based on the old
> > > > > > >>> interfaces can be accepted into the project, the footprint of
> > > > > > >>> SourceFunction in the user code still keeps growing (primarily for data
> > > > > > >>> generators and test utilities). I believe it is best to mark SourceFunction
> > > > > > >>> as deprecated as soon as possible. What do you think?
> > > > > >>>
> > > > > >>> Best,
> > > > > >>> Alexander Fedulov
> > > > > >>>
> > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
> --
> https://twitter.com/snntrable
> https://github.com/knaufk
>
