Hi Aljoscha,

Thank you for the proposal and for bringing this discussion up again.

Regarding the implementation, I would say the first option could
be easier and faster to implement, but it could add some overhead when
handling multiple side inputs through the current two-stream union
transform. I tried the second option myself, as it has less overhead,
but the outcome was something close to an N-ary operator that first
consumes each side input while buffering the main one.
Therefore, I would choose the third option, as it is more generic
and might also help in other scenarios, although its implementation
requires more effort.
I also agree with Gyula: I think the user should be allowed to define the
condition that determines when a side input is ready, e.g., load the side
input first, or update the side input incrementally.
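
To make the readiness idea a bit more concrete, here is a rough sketch in
plain Java. It is not Flink API and not what the FLIP specifies; the class
SideInputBuffer and all of its methods are hypothetical names chosen only
for illustration. It buffers main-input elements until a user-supplied
condition declares the side input ready, and keeps accepting side-input
updates afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/**
 * Hypothetical sketch (not Flink API): buffers main-input elements until a
 * user-defined condition says the side input is ready.
 */
class SideInputBuffer<M, S> {
    // User-defined readiness check, evaluated on the side input seen so far.
    private final Predicate<List<S>> readyCondition;
    private final List<S> sideInput = new ArrayList<>();
    private final List<M> bufferedMain = new ArrayList<>();
    private final List<String> output = new ArrayList<>();
    private boolean ready = false;

    SideInputBuffer(Predicate<List<S>> readyCondition) {
        this.readyCondition = readyCondition;
    }

    /** Called for every side-input element, before and after readiness. */
    void onSideElement(S element) {
        sideInput.add(element);
        if (!ready && readyCondition.test(sideInput)) {
            ready = true;
            // Flush everything that arrived on the main input while waiting.
            for (M buffered : bufferedMain) {
                emit(buffered);
            }
            bufferedMain.clear();
        }
    }

    /** Called for every main-input element. */
    void onMainElement(M element) {
        if (ready) {
            emit(element);
        } else {
            bufferedMain.add(element);
        }
    }

    // Stand-in for the real user function that would read the side input.
    private void emit(M element) {
        output.add(element + " joined with " + sideInput.size() + " side elements");
    }

    List<String> output() {
        return output;
    }

    int bufferedCount() {
        return bufferedMain.size();
    }
}
```

With a condition such as `side -> side.size() >= 2`, main elements are held
back until two side elements have arrived. A condition that is always true
would give the "always ready" behaviour, and side elements arriving after
readiness act as incremental updates.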

Best,
Ventura







On Mon, Mar 6, 2017 at 3:50 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:

> Hi Aljoscha,
>
> Thank you for the nice proposal!
>
> I think it would make sense to allow users to affect the readiness of the
> side input. I think making it ready when the first element arrives is only
> slightly better than making it always ready from a usability perspective.
> For instance, if I am joining against a static data set, I want to wait for
> the whole set before making it ready. This could be exposed as a
> user-defined condition that could maybe also recognize bounded inputs.
>
> Maybe we could also add an aggregating (merging) side input type that
> could work as a broadcast state.
>
> What do you think?
>
> Gyula
>
> Aljoscha Krettek <aljos...@apache.org> wrote (on Mon, Mar 6, 2017,
> 15:18):
>
> > Hi Folks,
> >
> > I would like to finally agree on a plan for implementing side inputs in
> > Flink. There has already been an attempt to come to consensus [1], which
> > resulted in two design documents. I tried to consolidate those two and
> > also added a section about implementation plans. This is the resulting
> > FLIP:
> >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-17+Side+Inputs+for+DataStream+API
> >
> >
> > In terms of semantics I tried to go with the minimal viable solution.
> > The part that needs discussing is how we want to implement this. I
> > outlined three possible implementation plans in the FLIP but what it
> > boils down to is that we need to introduce some way of getting several
> > inputs into an operator/task.
> >
> >
> > Please have a look at the doc and let us know what you think.
> >
> >
> >
> > Best,
> >
> > Aljoscha
> >
> >
> >
> > [1]
> > https://lists.apache.org/thread.html/797df0ba066151b77c7951fd7d603a8afd7023920d0607a0c6337db3@1462181294@%3Cdev.flink.apache.org%3E
> >
>
