Sounds good. Looks like we're on the same page.

Thanks!
Divye

On Wed, Nov 23, 2022 at 2:41 AM Piotr Nowojski <pnowoj...@apache.org> wrote:

> Hi Divye
>
> I think we are mostly on the same page. Just to clarify/rephrase:
>
> > One thing to think about - on EOF “trigger immediately” will mean that
> the
> > asynchronous wait timeout timers will also fire - which is undesirable
>
> I didn't mean to fire all timers immediately in all of the built-in
> operators. Just that each built-in operator can have a hard coded way
> (without a way for users to change it) to handle those timers. Windowed
> operators would trigger the lingering timers (flush outputs),
> AsyncWaitOperator could just ignore them. The same way users could register
> EOF timer handlers in the ProcessFunction as Dawid Wysakowicz proposed, we
> (as flink developers) could use the same mechanism to implement any
> behaviour we want for the built-in operators. There should be no need to
> add any separate mechanism.
>
> Best,
> Piotrek
>
> śr., 23 lis 2022 o 08:21 Divye Kapoor <dkap...@pinterest.com.invalid>
> napisał(a):
>
> > Thanks Yun/Piotrek,
> >
> > Some brief comments inline below.
> >
> > On Tue, Nov 22, 2022 at 1:37 AM Piotr Nowojski <pnowoj...@apache.org>
> > wrote:
> >
> > > Hi,
> > >
> > > All in all I would agree with Dawid's proposal.
> >
> > +1
> >
> > We can add the flexibility
> > > of how to deal with the timers in the low level API via adding a
> handler
> > -
> > > if someone needs to customize it, he will always have a workaround.
> Note
> > > after giving it more thought, I agree that registering some handlers is
> > > better than overloading the register timer method and modifying the
> > timer's
> > > state.
> >
> > +1.
> >
> > >
> > >
> > > At the same time, we can force the most sensible semantic that we think
> > for
> > > the couple of built-in operators, which should be pretty
> straightforward
> > > (either ignore the timers, or fire them at once). I agree there might
> be
> > > some edge cases, that theoretically user might want to wait for the
> timer
> > > to fire naturally, but:
> > > 1. I'm not sure how common in practice this will be. If not at all,
> then
> > > why should we be complicating the API/system?
> >
> > That’s fair.
> > However, the specifics are very important here.
> >
> > One thing to think about - on EOF “trigger immediately” will mean that
> the
> > asynchronous wait timeout timers will also fire - which is undesirable
> > (because they are racing with the last async call). However, the issue is
> > cleanly resolved by waiting for the timer to be canceled when the last
> > event is processed. (“Wait for” case).
> >
> > Ignoring the timer has the least justification. Registering the handler
> as
> > per Dawid’s proposal and having that handler unregister the timers on EOF
> > makes best sense. This solution also unifies the trigger immediately case
> > as that handler can reregister the timers for early termination.
> >
> > The proposal:
> > 1. Operator receives EOF
> > 2. EOF timer handler triggers
> > 3. EOF handler adjusts the registered timers for early trigger or ignore.
> > If wait-for behavior is desired, timers are not changed. This is
> controlled
> > in client code.
> > 4. Operator waits for all timers to drain/trigger. (“Always”). There is
> no
> > special handling for ignore/early trigger.
> > 5. Operator allows job to proceed with shutdown.
> >
> > The only api change needed is an EOF handler.
> > The other agreement we need is that “Wait for” is the desired behavior in
> > processing time and that processing time is fundamentally different from
> > event time in this respect.
> > (I have changed my thinking since the last mail).
> >
> > 2. We can always expand the API in the future, and let the user override
> > > the default built-in behaviour of the operators via some setter on the
> > > stream transformation (`SingleOutputStreamOperator`), or via some
> custom
> > > API DSL style in each of the operators separately.
> >
> >
> > This is not required. See above.
> >
> >
> > >
> > > Re forcing the same semantics for processing time timers as for event
> > time
> > > ones - this is tempting, but indeed I see a possibility that users need
> > to
> > > adhere to some external constraints when using processing time.
> >
> > +1. As above, we should consider the 2 cases fundamentally different in
> > this area.
> >
> > Re: Yun -
> >
> > >  b) Another issue is that what if users use timers with different
> > > > termination actions in the same
> > > >  operator / UDF? For example, users use some kind of timeout (like
> > throws
> > > > exception if some thing
> > > >  not happen after some other thing), and also some kind of window
> > > > aggregation logic. In this case,
> > > >  without additional tags, users might not be able to distinguish
> which
> > > > timer should be canceled and
> > > >  which time should be triggered ?
> >
> > as above. The EOF handler makes the choice.
> >
> > >
> > > > 4. How could these scenarios adjust their APIs ?
> > > >  From the current listed scenarios, I'm more tend to that as @Dawid
> > > > pointed out, there might be only
> > > >  one expected behavior for each scenario, thus it does not seems to
> > need
> > > > to allow users to adjust the
> > > >  behavior. Thus @Divye may I have a double confirmation currently do
> we
> > > > have explicit scenarios that
> > > >  is expected to change the different behaviors for the same scenario?
> >
> > Wait-for behavior is probably the only expected behavior and any
> > alterations should be from the EOF handler managing the registered
> timers.
> >
> > >
> > > >  Besides @Divye from the listed scenarios, I have another concern for
> > > > global configuration is that for
> > > >  one job, different operators seems to still might have different
> > > expected
> > > > behaviors. For example, A
> > > >  job using both Window operator and AsyncWaitOperator might have
> > > different
> > > > requirements for timers
> > > >  on termination?
> >
> >
> > Thank you for raising this case. This changed my thinking. Based on your
> > point, we should try and align on the “Wait-for” with EOF handler
> proposal.
> > I’m withdrawing the “single-runtime-config” proposal.
> >
> > Best,
> > Divye
> >
>

Reply via email to