There is no watermark handling yet. :D

But this would enable me to do this.

On Tue, May 5, 2015 at 3:39 PM, Paris Carbone <par...@kth.se> wrote:
> I agree with Gyula on this one. Barriers should better not be exposed to the 
> operator. They are system events for state management. Apart from that, 
> watermark handling seems to be on a right track, I like it so far.
>
>> On 05 May 2015, at 15:26, Aljoscha Krettek <aljos...@apache.org> wrote:
>>
>> I don't know, I just put that there because other people are working
>> on the checkpointing/barrier thing. So there would need to be some
>> functionality there at some point.
>>
>> Or maybe it is not required there and can be handled in the
>> StreamTask. Others might know this better than I do right now.
>>
>> On Tue, May 5, 2015 at 3:24 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:
>>> What would the processBarrier method do?
>>>
>>> On Tuesday, May 5, 2015, Aljoscha Krettek <aljos...@apache.org> wrote:
>>>
>>>> I'm using the term punctuation and watermark interchangeably here
>>>> because for practical purposes they do the same thing. I'm not sure
>>>> what you meant with your comment about those.
>>>>
>>>> For the Operator interface I'm thinking about something like this:
>>>>
>>>> abstract class OneInputStreamOperator<IN, OUT, F extends Function>  {
>>>>   public processElement(IN element);
>>>>   public processBarrier(...);
>>>>   public processPunctuation/lowWatermark(...):
>>>> }
>>>>
>>>> The operator also has access to the TaskContext and ExecutionConfig
>>>> and Serializers. The operator would emit values using an emit() method
>>>> or the Collector interface, not sure about that yet.
>>>>
>>>> On Tue, May 5, 2015 at 3:12 PM, Gyula Fóra <gyf...@apache.org
>>>> <javascript:;>> wrote:
>>>>> I think this a good idea in general. I would try to minimize the methods
>>>> we
>>>>> include and make the ones that we keep very concrete. For instance i
>>>> would
>>>>> not have the receive barrier method as that is handled on a totally
>>>>> different level already. And instead of punctuation I would directly add
>>>> a
>>>>> method to work on watermarks.
>>>>>
>>>>> On Tuesday, May 5, 2015, Aljoscha Krettek <aljos...@apache.org
>>>> <javascript:;>> wrote:
>>>>>
>>>>>> What do you mean by "losing iterations"?
>>>>>>
>>>>>> For the pros and cons:
>>>>>>
>>>>>> Cons: I can't think of any, since most of the operators are chainable
>>>>>> already and already behave like a collector.
>>>>>>
>>>>>> Pros:
>>>>>> - Unified model for operators, chainable operators don't have to
>>>>>> worry about input iterators and the collect interface.
>>>>>> - Enables features that we want in the future, such as barriers and
>>>>>> punctuations because they don't work with the
>>>>>>  simple Collector interface.
>>>>>> - The while-loop is moved outside of the operators, now the Task (the
>>>>>> thing that runs Operators) can control the flow of data better and
>>>>>> deal with
>>>>>>  stuff like barriers and punctuations. If we want to keep the
>>>>>> main-loop inside each operator, then they all have to manage input
>>>>>> readers and inline events manually.
>>>>>>
>>>>>> On Tue, May 5, 2015 at 2:41 PM, Kostas Tzoumas <ktzou...@apache.org
>>>> <javascript:;>
>>>>>> <javascript:;>> wrote:
>>>>>>> Can you give us a rough idea of the pros and cons? Do we lose some
>>>>>>> functionality by getting rid of iterations?
>>>>>>>
>>>>>>> Kostas
>>>>>>>
>>>>>>> On Tue, May 5, 2015 at 1:37 PM, Aljoscha Krettek <aljos...@apache.org
>>>> <javascript:;>
>>>>>> <javascript:;>>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Folks,
>>>>>>>> while working on introducing source-assigned timestamps into
>>>> streaming
>>>>>>>> (https://issues.apache.org/jira/browse/FLINK-1967) I thought about
>>>> how
>>>>>>>> the punctuations (low watermarks) can be pushed through the system.
>>>>>>>> The problem is, that operators can have two ways of getting input: 1.
>>>>>>>> They read directly from input iterators, and 2. They act as a
>>>>>>>> Collector and get elements via collect() from the previous operator
>>>> in
>>>>>>>> a chain.
>>>>>>>>
>>>>>>>> This makes it hard to push things through a chain that are not
>>>>>>>> elements, such as barriers and/or punctuations.
>>>>>>>>
>>>>>>>> I propose to change all streaming operators to be push based, with a
>>>>>>>> slightly improved interface: In addition to collect(), which I would
>>>>>>>> call receiveElement() I would add receivePunctuation() and
>>>>>>>> receiveBarrier(). The first operator in the chain would also get data
>>>>>>>> from the outside invokable that reads from the input iterator and
>>>>>>>> calls receiveElement() for the first operator in a chain.
>>>>>>>>
>>>>>>>> What do you think? I would of course be willing to implement this
>>>>>> myself.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Aljoscha
>>>>>>>>
>>>>>>
>>>>
>

Reply via email to