What would the processBarrier method do?

On Tuesday, May 5, 2015, Aljoscha Krettek <aljos...@apache.org> wrote:
> I'm using the term punctuation and watermark interchangeably here
> because for practical purposes they do the same thing. I'm not sure
> what you meant with your comment about those.
>
> For the Operator interface I'm thinking about something like this:
>
> abstract class OneInputStreamOperator<IN, OUT, F extends Function> {
>     public processElement(IN element);
>     public processBarrier(...);
>     public processPunctuation/lowWatermark(...);
> }
>
> The operator also has access to the TaskContext and ExecutionConfig
> and Serializers. The operator would emit values using an emit() method
> or the Collector interface, not sure about that yet.
>
> On Tue, May 5, 2015 at 3:12 PM, Gyula Fóra <gyf...@apache.org> wrote:
> > I think this is a good idea in general. I would try to minimize the
> > methods we include and make the ones that we keep very concrete. For
> > instance I would not have the receive barrier method as that is
> > handled on a totally different level already. And instead of
> > punctuation I would directly add a method to work on watermarks.
> >
> > On Tuesday, May 5, 2015, Aljoscha Krettek <aljos...@apache.org> wrote:
> >
> >> What do you mean by "losing iterations"?
> >>
> >> For the pros and cons:
> >>
> >> Cons: I can't think of any, since most of the operators are chainable
> >> already and already behave like a collector.
> >>
> >> Pros:
> >>  - Unified model for operators, chainable operators don't have to
> >>    worry about input iterators and the collect interface.
> >>  - Enables features that we want in the future, such as barriers and
> >>    punctuations because they don't work with the simple Collector
> >>    interface.
> >>  - The while-loop is moved outside of the operators, now the Task (the
> >>    thing that runs Operators) can control the flow of data better and
> >>    deal with stuff like barriers and punctuations. If we want to keep
> >>    the main-loop inside each operator, then they all have to manage
> >>    input readers and inline events manually.
> >>
> >> On Tue, May 5, 2015 at 2:41 PM, Kostas Tzoumas <ktzou...@apache.org> wrote:
> >> > Can you give us a rough idea of the pros and cons? Do we lose some
> >> > functionality by getting rid of iterations?
> >> >
> >> > Kostas
> >> >
> >> > On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <aljos...@apache.org>
> >> > wrote:
> >> >
> >> >> Hi Folks,
> >> >> while working on introducing source-assigned timestamps into streaming
> >> >> (https://issues.apache.org/jira/browse/FLINK-1967) I thought about how
> >> >> the punctuations (low watermarks) can be pushed through the system.
> >> >> The problem is that operators can have two ways of getting input: 1.
> >> >> They read directly from input iterators, and 2. They act as a
> >> >> Collector and get elements via collect() from the previous operator in
> >> >> a chain.
> >> >>
> >> >> This makes it hard to push things through a chain that are not
> >> >> elements, such as barriers and/or punctuations.
> >> >>
> >> >> I propose to change all streaming operators to be push based, with a
> >> >> slightly improved interface: In addition to collect(), which I would
> >> >> call receiveElement() I would add receivePunctuation() and
> >> >> receiveBarrier(). The first operator in the chain would also get data
> >> >> from the outside invokable that reads from the input iterator and
> >> >> calls receiveElement() for the first operator in a chain.
> >> >>
> >> >> What do you think? I would of course be willing to implement this
> >> >> myself.
> >> >>
> >> >> Cheers,
> >> >> Aljoscha
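
For reference, here is a minimal Java sketch of the push-based design discussed in the quoted thread. It is an illustration under assumptions, not Flink's actual API: the names PushBasedSketch, Output, OneInputOperator, Event, chain() and runTask() are all hypothetical. It leaves out processBarrier(), following the suggestion that barriers are handled on a different level, and it models a watermark as a plain long. The point it shows is that the while-loop lives in the task, and that a chained operator looks like an output to its predecessor, so non-element events can travel through the chain.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// All names here are hypothetical; this is not Flink's actual API.
final class PushBasedSketch {

    /** Where an operator sends its results: the next operator or the task output. */
    interface Output<T> {
        void emit(T element);
        void emitWatermark(long timestamp);
    }

    /** Push-based operator: it never reads input itself, it is only called. */
    interface OneInputOperator<IN, OUT> {
        void processElement(IN element, Output<OUT> out);
        void processWatermark(long timestamp, Output<OUT> out);
    }

    /** One input record: either a data element or a watermark. */
    static final class Event<T> {
        final T element;           // null for watermarks
        final long timestamp;      // only meaningful for watermarks
        final boolean isWatermark;

        private Event(T element, long timestamp, boolean isWatermark) {
            this.element = element;
            this.timestamp = timestamp;
            this.isWatermark = isWatermark;
        }

        static <T> Event<T> element(T value) { return new Event<T>(value, 0L, false); }
        static <T> Event<T> watermark(long ts) { return new Event<T>(null, ts, true); }
    }

    /** Chaining: wraps an operator so it looks like an Output to its predecessor. */
    static <IN, OUT> Output<IN> chain(final OneInputOperator<IN, OUT> op, final Output<OUT> next) {
        return new Output<IN>() {
            @Override public void emit(IN element) { op.processElement(element, next); }
            @Override public void emitWatermark(long ts) { op.processWatermark(ts, next); }
        };
    }

    /** The single main loop, owned by the task instead of by each operator. */
    static <IN> void runTask(Iterator<Event<IN>> input, Output<IN> head) {
        while (input.hasNext()) {
            Event<IN> event = input.next();
            if (event.isWatermark) {
                head.emitWatermark(event.timestamp);
            } else {
                head.emit(event.element);
            }
        }
    }

    /** Runs two chained "doubling" operators and records what reaches the sink. */
    static List<String> demo() {
        final List<String> log = new ArrayList<String>();
        Output<Integer> sink = new Output<Integer>() {
            @Override public void emit(Integer e) { log.add("element:" + e); }
            @Override public void emitWatermark(long ts) { log.add("watermark:" + ts); }
        };
        OneInputOperator<Integer, Integer> doubler = new OneInputOperator<Integer, Integer>() {
            @Override public void processElement(Integer e, Output<Integer> out) { out.emit(e * 2); }
            @Override public void processWatermark(long ts, Output<Integer> out) { out.emitWatermark(ts); }
        };
        Output<Integer> head = chain(doubler, chain(doubler, sink));

        List<Event<Integer>> input = new ArrayList<Event<Integer>>();
        input.add(Event.<Integer>element(1));
        input.add(Event.<Integer>watermark(10L));
        input.add(Event.<Integer>element(3));
        runTask(input.iterator(), head);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the operators contain no loop and no input reader. Only runTask() touches the iterator, which is the place where barriers or other inline events could be intercepted before anything is handed to the chain.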