Good points. If we want structured loops on streaming, we will need to inject
iteration counters. The question is whether we really need structured iterations on
plain data streams. Window iterations, on the other hand, are a must-have...
Paris
> On 07 Jul 2015, at 16:43, Kostas Tzoumas wrote:
>
> I
I see. Perhaps more important IMO is defining the semantics of stream loops
with event time.
The reason I asked about nested is that Naiad and other designs used a
multidimensional timestamp to capture loops: (outer loop counter, inner
loop counter, timestamp). I assume that currently making sense
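To make the multidimensional timestamp idea concrete, here is a minimal sketch of a record-level timestamp of the form (outer loop counter, inner loop counter, timestamp). The class name LoopTimestamp and its methods are hypothetical and not part of any Flink or Naiad API. The total order used here (event time first, then outer counter, then inner counter) is a simplifying assumption for illustration; Naiad itself defines a partial order over such timestamps.

```java
import java.util.Comparator;

// Hypothetical type for illustration only: a loop-aware timestamp
// (outer loop counter, inner loop counter, event timestamp).
final class LoopTimestamp implements Comparable<LoopTimestamp> {
    final int outerIteration;
    final int innerIteration;
    final long eventTime;

    // Assumed ordering: event time, then outer counter, then inner counter.
    private static final Comparator<LoopTimestamp> ORDER =
        Comparator.<LoopTimestamp>comparingLong(t -> t.eventTime)
            .thenComparingInt(t -> t.outerIteration)
            .thenComparingInt(t -> t.innerIteration);

    LoopTimestamp(int outerIteration, int innerIteration, long eventTime) {
        this.outerIteration = outerIteration;
        this.innerIteration = innerIteration;
        this.eventTime = eventTime;
    }

    // A record passing through the inner feedback edge once more.
    LoopTimestamp nextInner() {
        return new LoopTimestamp(outerIteration, innerIteration + 1, eventTime);
    }

    // A record passing through the outer feedback edge; the inner counter restarts.
    LoopTimestamp nextOuter() {
        return new LoopTimestamp(outerIteration + 1, 0, eventTime);
    }

    @Override
    public int compareTo(LoopTimestamp other) {
        return ORDER.compare(this, other);
    }

    @Override
    public String toString() {
        return "(" + outerIteration + ", " + innerIteration + ", " + eventTime + ")";
    }
}
```

In this sketch, the injected iteration counters travel with each record, so an operator could tell which pass of which loop a record belongs to without changing its event time.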
Okay, I am fine with this approach as well; I see the advantages. Then we
just need to find a suitable name for marking a "FeedbackPoint" :)
Stephan Ewen wrote (on Tue, 7 Jul 2015, 16:28):
> In Aljoscha's approach, we would need a special mutable stream. We could do
> it like this:
>
>
In Aljoscha's approach, we would need a special mutable stream. We could do
it like this:
DataStream source = ...;
FeedbackPoint pt = source.createFeedbackPoint();
DataStream mapper = pt.map(noOpMapper);
DataStream feedback = mapper.filter(...);
pt.addFeedback(feedback);
It is basically like the
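The behaviour of such a feedback edge can be imitated without any streaming runtime. The following is a minimal single-threaded sketch, not the proposed Flink API: FeedbackPointSketch and run are hypothetical names, and it assumes that records passing the feedback filter re-enter the loop while all other records leave it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical toy model of a feedback point; not a Flink class.
final class FeedbackPointSketch {

    // Applies 'body' to every pending record. Records matching 'feedback'
    // are queued again at the feedback point; the rest are emitted.
    // Assumption: the loop only terminates if records eventually fail the filter.
    static <T> List<T> run(List<T> source, Function<T, T> body, Predicate<T> feedback) {
        Deque<T> pending = new ArrayDeque<>(source);
        List<T> output = new ArrayList<>();
        while (!pending.isEmpty()) {
            T result = body.apply(pending.poll());
            if (feedback.test(result)) {
                pending.add(result);   // fed back into the loop
            } else {
                output.add(result);    // leaves the loop
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<Integer> out = run(Arrays.asList(1, 5, 9), x -> x + 4, x -> x < 10);
        System.out.println(out); // prints [13, 13, 13]
    }
}
```

Unlike this sketch, a real streaming job has no natural end of input, which is one reason the termination and event-time semantics of such loops need to be defined separately.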
@Aljoscha:
Yes, that's basically my point as well. This is what happens now too, but we
give this mutable datastream a special name: IterativeDataStream.
This can be handled in very different ways through the API; the goal would
be to make something easy to use. I am fine with what we have now becau
I think it could work if we allowed a DataStream to be unioned after
creation. For example:
DataStream source = ...
DataStream mapper = source.map(noOpMapper)
DataStream feedback = mapper.filter(...)
source.union(feedback)
This would basically mean that a DataStream is mutable and can be extended
@Kostas:
This new API is, I believe, equivalent in expressivity to the current one.
We can define nested loops now as well.
I also don't see nested loops as generally much worse than simple loops.
Gyula Fóra wrote (on Tue, 7 Jul 2015, 16:14):
> Sorry Stephan I meant it slightly differ
Sorry Stephan, I meant it slightly differently. I see your point:
DataStream source = ...
SingleInputOperator mapper = source.map(...)
mapper.addInput()
So addInput would be a method of the operator, not the stream.
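One way to picture the difference is a toy dataflow graph in which inputs are attached to operator nodes rather than to streams. The sketch below is only an illustration under that assumption: OperatorNode, addInput, and isOnCycle are hypothetical names, not Flink classes, and the argument to addInput (elided in the snippet above) is assumed to be the upstream operator.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of an operator in a dataflow graph; not a Flink class.
final class OperatorNode {
    final String name;
    final List<OperatorNode> inputs = new ArrayList<>();

    OperatorNode(String name) {
        this.name = name;
    }

    // Adds an input edge to this operator after it has been created.
    OperatorNode addInput(OperatorNode upstream) {
        inputs.add(upstream);
        return this;
    }

    // True if this operator can reach itself by following input edges,
    // i.e. it is part of a feedback loop.
    boolean isOnCycle() {
        return reaches(this, this, new ArrayList<OperatorNode>());
    }

    private static boolean reaches(OperatorNode from, OperatorNode target,
                                   List<OperatorNode> visited) {
        for (OperatorNode in : from.inputs) {
            if (in == target) {
                return true;
            }
            if (!visited.contains(in)) {
                visited.add(in);
                if (reaches(in, target, visited)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

With nodes for source, mapper, and filter wired in that order, calling mapper.addInput(filter) closes the loop, and mapper.isOnCycle() then returns true while the source stays outside the cycle.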
Aljoscha Krettek wrote (on Tue, 7 Jul 2015, 16:12):
> I think thi
+1 for rethinking the iterations in DataStream
However, wouldn't this proposal allow the definition of arbitrary loops
(e.g., nested loops) that are not well behaved, AFAIK?
On Tue, Jul 7, 2015 at 4:12 PM, Stephan Ewen wrote:
> I see that the newly proposed API makes some things easier to define
I think this would be good, yes. I was just about to open an issue for
changing the Streaming Iteration API. :D
Then we should also make the implementation very straightforward and
simple. Right now, the implementation of the iterations is all over the
place.
On Tue, 7 Jul 2015 at 15:57 Gyula Fóra
I see that the newly proposed API makes some things easier to define.
There is one source of confusion, though, in my opinion:
The new API suggests that the data stream actually refers to the operator
that created it.
The "DataStream mapper = source.map(noOpMapper)" here refers to the map
operato
Hey,
Along with the suggested changes to the streaming API structure, I think we
should also rework the "iteration" API. Currently the iteration API tries
to mimic the syntax of the batch API while the runtime behaviour is quite
different.
What we create instead of iterations is really just cyclic