Re: [DISCUSS] Behaviour of Streaming Sources

2015-05-11 Thread Stephan Ewen
Thanks, Marton, for the summary. The PollingStreamSource essentially follows the API suggested by Aljoscha, and will probably internally use a backoff sleep time (as Matthias pointed out), so we really have arrived at a mix of techniques ;-) On Mon, May 11, 2015 at 4:12 PM, Márton Balassi wrote:

Re: [DISCUSS] Behaviour of Streaming Sources

2015-05-11 Thread Márton Balassi
We had a conversation with Stephan, Aljoscha, Gyula and Paris and converged on the following outline for the streaming source interface. The question is tricky because we need to coordinate between the actual source computation and triggering the checkpointing of the state of the source. We should

Re: [DISCUSS] Behaviour of Streaming Sources

2015-05-11 Thread Gyula Fóra
I would not go into this direction. Returning lists is messy I think. I would stick with hasNext and Next returning a single element On Mon, May 11, 2015 at 10:20 AM, Aljoscha Krettek wrote: > We could also change next() to return List and say that the method > must not sit and wait but simply r

Re: [DISCUSS] Behaviour of Streaming Sources

2015-05-11 Thread Aljoscha Krettek
We could also change next() to return List and say that the method must not sit and wait but simply return stuff that is available without waiting while also being able to not return anything for the moment. On Fri, May 8, 2015 at 12:05 PM, Matthias J. Sax wrote: > You are right. That is why I po

Re: [DISCUSS] Behaviour of Streaming Sources

2015-05-08 Thread Matthias J. Sax
You are right. That is why I pointed out this already: > -> You could force the UDF to return each time, be disallowing >>> consecutive calls to Collector.out(...). The Storm design would avoid the "NULL-Problem" Aljoscha mentioned, too. -Matthias On 05/08/2015 10:59 AM, Gyula Fóra wrote: >

Re: [DISCUSS] Behaviour of Streaming Sources

2015-05-08 Thread Gyula Fóra
I think the problem with this void next() approach is exactly the way it works: "Using this interface, "next()" can loop internally as long as tuples are available and return if there is (currently) no input." We dont want the user to loop internally in the next because then we have almost the sa

Re: [DISCUSS] Behaviour of Streaming Sources

2015-05-08 Thread Matthias J. Sax
Did you consider the Storm way to handle this? Storm offers a method "void next()" that uses a collector object to emit new tuples. Using this interface, "next()" can loop internally as long as tuples are available and return if there is (currently) no input. What I have seen, people tend to emit

[DISCUSS] Behaviour of Streaming Sources

2015-05-08 Thread Aljoscha Krettek
Hi, in the process of reworking the Streaming Operator model I'm also reworking the sources in order to get rid of the loop in each source. Right now, the interface for sources (SourceFunction) has one method: run(). This is called when the source starts and can just output elements at any time usi