Thanks, Marton, for the summary.
The PollingStreamSource essentially follows the API suggested by Aljoscha,
and will probably internally use a backoff sleep time (as Matthias pointed
out), so
we really have arrived at a mix of techniques ;-)
On Mon, May 11, 2015 at 4:12 PM, Márton Balassi
wrote:
We had a conversation with Stephan, Aljoscha, Gyula and Paris and converged
on the following outline for the streaming source interface. The question
is tricky because we need to coordinate between the actual source
computation and triggering the checkpointing of the state of the source.
We should
I would not go into this direction. Returning lists is messy I think. I
would stick with hasNext and Next returning a single element
On Mon, May 11, 2015 at 10:20 AM, Aljoscha Krettek
wrote:
> We could also change next() to return List and say that the method
> must not sit and wait but simply r
We could also change next() to return List and say that the method
must not sit and wait but simply return stuff that is available
without waiting while also being able to not return anything for the
moment.
On Fri, May 8, 2015 at 12:05 PM, Matthias J. Sax
wrote:
> You are right. That is why I po
You are right. That is why I pointed out this already:
> -> You could force the UDF to return each time, be disallowing
>>> consecutive calls to Collector.out(...).
The Storm design would avoid the "NULL-Problem" Aljoscha mentioned, too.
-Matthias
On 05/08/2015 10:59 AM, Gyula Fóra wrote:
>
I think the problem with this void next() approach is exactly the way it
works:
"Using this interface, "next()" can loop internally as long
as tuples are available and return if there is (currently) no input."
We dont want the user to loop internally in the next because then we have
almost the sa
Did you consider the Storm way to handle this?
Storm offers a method "void next()" that uses a collector object to emit
new tuples. Using this interface, "next()" can loop internally as long
as tuples are available and return if there is (currently) no input.
What I have seen, people tend to emit
Hi,
in the process of reworking the Streaming Operator model I'm also
reworking the sources in order to get rid of the loop in each source.
Right now, the interface for sources (SourceFunction) has one method:
run(). This is called when the source starts and can just output
elements at any time usi