Hi Xu,

Thanks for drafting the FLIP. I have a question regarding the
motivation for the change.

So far, the DataStream API doesn't support any relational operations,
and users who want joins, groupBy, etc. usually use SQL or the Table
API.
With this FLIP, the boundary between the APIs becomes blurrier, and
users might be confused by the different join semantics of the API
layers.

I would like to know the arguments for introducing these operations
in the DataStream API rather than using the Table API. This rationale
should also be part of the FLIP, so that it is clearer how the change
helps Flink users.

Best,
Fabian


On Mon, Jan 6, 2025 at 9:25 AM Junrui Lee <jrlee....@gmail.com> wrote:
>
> Hi Xu,
>
> Thanks for your work. I have a small question: In JoinExtension, when a
> record is received, is it immediately joined with all the data received
> from the other side, or does it wait until both streams are finished before
> joining? Also, does it work on both bounded and unbounded streams?
>
> On Sat, Jan 4, 2025 at 11:50, Xu Huang <huangxu.wal...@gmail.com> wrote:
>
> > Hi Devs,
> >
> > Weijie Guo and I would like to initiate a discussion about FLIP-500:
> > Support Join Extension in DataStream V2 API [1].
> >
> > In relational algebra, Join is used to co-group two datasets and combine
> > the data based on specific conditions. For stream computing systems, the
> > data of the two streams is cached (usually through State) when the Join
> > operation is performed. When data from either stream arrives, it can be
> > matched with data from the other stream. Therefore, Join has been widely used
> > in multi-stream aggregation scenarios.
> >
> > To make it easy for users to use Join in DataStream V2, this FLIP will
> > implement the Join extension in DataStream V2.
> >
> > For more details, please refer to FLIP-500 [1]. We look forward to your
> > feedback.
> >
> >
> > Best,
> >
> > Xu Huang
> >
> >
> > [1] https://cwiki.apache.org/confluence/x/ywz0Ew
> >
