I think it looks like Beam rather than Hadoop :)

What Stephan meant was that he wanted a dedicated output method in the
ProcessWindowFunction. I agree with Aljoscha that we shouldn't expose
the collector.

On Tue, Jul 19, 2016 at 10:45 PM, Aljoscha Krettek <aljos...@apache.org> wrote:
> You mean keep the Collector? I don't like that one because it has the
> close() method that should never be called by the user.
>
> We can keep it, though, because all the other user function interfaces also
> expose it to the user.
>
> On Tue, 19 Jul 2016 at 15:22 Stephan Ewen <se...@apache.org> wrote:
>
>> I would actually make the output a separate parameter as well. Pretty much
>> like the old variant, only replacing the "Window" parameter by the context
>> (which contains everything about the window).
>> It could also be called "WindowInvocationContext" or so.
>>
>> The current variant looks too Hadoop to me ;-) Everything done on the
>> context object, and messy mocking when creating tests.
>>
>> On Mon, Jul 18, 2016 at 6:42 PM, Radu Tudoran <radu.tudo...@huawei.com>
>> wrote:
>>
>> > Hi,
>> >
>> > Sorry - I made a mistake - I was thinking of getting access to the
>> > collection (mist-read :) collector) of events in the window buffer  in
>> > order to be able to delete/evict some of them which are not necessary the
>> > last ones.
>> >
>> >
>> > Radu
>> >
>> > -----Original Message-----
>> > From: Aljoscha Krettek [mailto:aljos...@apache.org]
>> > Sent: Monday, July 18, 2016 5:54 PM
>> > To: dev@flink.apache.org
>> > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
>> >
>> > What about the collector? This is only used for emitting elements to the
>> > downstream operation.
>> >
>> > On Mon, 18 Jul 2016 at 17:52 Radu Tudoran <radu.tudo...@huawei.com>
>> wrote:
>> >
>> > > Hi,
>> > >
>> > > I think it looks good and most importantly is that we can extend it in
>> > > the directions discussed so far.
>> > >
>> > > One question though regarding the Collector - are we going to be able
>> > > to delete random elements from the list if this is not exposed as a
>> > > collection, at least to the evictor? If not, how are we going to
>> > > extend in the future to cover this case?
>> > >
>> > > Regarding the ordering - I also observed that there are situations
>> > > where elements do not have a logical order. One example is if you have
>> > > high rates of the events. Nevertheless, even if now is not the time
>> > > for this, I think in the future we can imagine having also some data
>> > > structures that offer some ordering. It can save some computation
>> > > efforts later in the functions for some use cases.
>> > >
>> > >
>> > > Best regards,
>> > >
>> > >
>> > > -----Original Message-----
>> > > From: Aljoscha Krettek [mailto:aljos...@apache.org]
>> > > Sent: Monday, July 18, 2016 3:45 PM
>> > > To: dev@flink.apache.org
>> > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
>> > >
>> > > I incorporated the changes. The proposed interface of
>> > > ProcessWindowFunction is now this:
>> > >
>> > > public abstract class ProcessWindowFunction <IN, OUT, KEY, W extends
>> > > Window> implements Function {
>> > >
>> > >     public abstract void process(KEY key, Iterable<IN> elements,
>> > > Context
>> > > ctx) throws Exception;
>> > >
>> > >     public abstract class Context {
>> > >         public abstract W window();
>> > >         public abstract void output(OUT value);
>> > >     }
>> > > }
>> > >
>> > > I'm proposing to not expose Collector anymore because it has the
>> > > close() method that should not be called by users. Having the output()
>> > > call directly on the context should work just as well.
>> > >
>> > > Also, I marked the "adding a firing reason" and "adding firing
>> > > counter" as future work that are only examples of stuff that can be
>> > > implemented on top of the new interface. Initially, this will provide
>> > > exactly the same features as the old API but be extensible. I did this
>> > > to not make the scope of this proposal to big because Radu also
>> > > suggested more changes and each of them should be covered in a separate
>> > design doc or FLIP.
>> > >
>> > > @Radu: On the different buffer types. I think this would be very
>> tricky.
>> > > Right now, people should also not rely on the fact that elements are
>> > > "FIFO". Some state backends might keep the elements in a different
>> > > order and when you have merging windows/session windows the order of
>> > > the elements will also not be preserved.
>> > >
>> > > Cheers,
>> > > Aljoscha
>> > >
>> > > On Wed, 13 Jul 2016 at 18:40 Radu Tudoran <radu.tudo...@huawei.com>
>> > wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > If it is to extend the Context to pass more information between the
>> > > > stages of processing a window (triggering -> process -> eviction),
>> > > > why not adding also a "EvictionInfo"? I think this might actually
>> > > > help with the issues discussed in the tread related to the eviction
>> > policy.
>> > > > I could imagine using this parameter to pass the conditions, from
>> > > > the processing stage to the evictor, about what events to be
>> > eliminated.
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > public abstract class Context {
>> > > >
>> > > >    public abstract EvictionInfo evictionInfo();
>> > > >
>> > > > ...
>> > > >
>> > > >
>> > > >    public abstract KEY key();
>> > > >
>> > > >    public abstract W window();
>> > > >
>> > > >    public abstract int id();
>> > > >
>> > > >    public abstract FiringInfo firingInfo();
>> > > >
>> > > >    public abstract Iterable<IN> elements();
>> > > >
>> > > >    public abstract void output(OUT value);
>> > > >
>> > > > }
>> > > >
>> > > >
>> > > > Also on a slightly unrelated issue - how hard it would be to
>> > > > introduce different types of buffers for the windows. Currently the
>> > > > existing one is behaving (when under processing) similar with a FIFO
>> > > > queue (in the sense that you need to start from beginning, from the
>> > oldest element).
>> > > > How about enabling for example also LIFO behavior (start iterating
>> > > > through the list from the most recent element). As in the source
>> > > > queues or stacks are not actually used, perhaps we can just pass
>> > > > policies to the iterator - or have custom itrators
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > Dr. Radu Tudoran
>> > > > Research Engineer - Big Data Expert
>> > > > IT R&D Division
>> > > >
>> > > >
>> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH European Research Center
>> > > > Riesstrasse 25, 80992 München
>> > > >
>> > > > E-mail: radu.tudo...@huawei.com
>> > > > Mobile: +49 15209084330
>> > > > Telephone: +49 891588344173
>> > > >
>> > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH Hansaallee 205, 40549
>> > > > Düsseldorf, Germany, www.huawei.com Registered
>> > > > Office: Düsseldorf, Register Court Düsseldorf, HRB 56063, Managing
>> > > > Director: Bo PENG, Wanzhou MENG, Lifang CHEN Sitz der Gesellschaft:
>> > > > Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
>> > > > Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN This e-mail and
>> > > > its attachments contain confidential information from HUAWEI, which
>> > > > is intended only for the person or entity whose address is listed
>> > above.
>> > > > Any use of the information contained herein in any way (including,
>> > > > but not limited to, total or partial disclosure, reproduction, or
>> > > > dissemination) by persons other than the intended recipient(s) is
>> > > > prohibited. If you receive this e-mail in error, please notify the
>> > > > sender by phone or email immediately and delete it!
>> > > >
>> > > >
>> > > > -----Original Message-----
>> > > > From: Aljoscha Krettek [mailto:aljos...@apache.org]
>> > > > Sent: Wednesday, July 13, 2016 2:24 PM
>> > > > To: dev@flink.apache.org
>> > > > Subject: Re: [DISCUSS] FLIP-2 Extending Window Function Metadata
>> > > >
>> > > > Sure, I also thought about this but went for the "extreme" initially.
>> > > > If no-one objects I'll update the doc in a bit.
>> > > >
>> > > > On Wed, 13 Jul 2016 at 14:17 Stephan Ewen <se...@apache.org> wrote:
>> > > >
>> > > > > Thanks for opening this.
>> > > > >
>> > > > > I see the need for having an extensible context object for window
>> > > > > function invocations, but i think hiding every parameter in the
>> > > > > context is a bit unnatural.
>> > > > >
>> > > > > How about having a function "apply(Key, Values, WindowContext,
>> > > > Collector)"
>> > > > > ?
>> > > > > It should be possible to write the straightforward use cases
>> > > > > without accessing the context object.
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Wed, Jul 13, 2016 at 1:56 PM, Aljoscha Krettek
>> > > > > <aljos...@apache.org>
>> > > > > wrote:
>> > > > >
>> > > > > > Hi,
>> > > > > > this is a proposal to introduce a new interface for the window
>> > > > > > function
>> > > > > to
>> > > > > > make it more extensible for the future where we might want to
>> > > > > > provide additional information about why a window fired to the
>> > > > > > user
>> > > > function:
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-2+Extending
>> > > > > +W
>> > > > > in
>> > > > > dow+Function+Metadata
>> > > > > >
>> > > > > > I'd appreciate your thoughts!
>> > > > > >
>> > > > > > Cheers,
>> > > > > > Aljoscha
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>

Reply via email to