There is no "lateral communication" right now. Typical pattern is to break it up in two operators that communicate in an all-to-all fashion.
On Thu, Jun 4, 2015 at 11:52 AM, Gyula Fóra <gyula.f...@gmail.com> wrote:
> I am simply thinking about the best way to send data to different
> subtasks of the same operator.
>
> Can we go back to the original question? :D
>
> Stephan Ewen <se...@apache.org> wrote (on Wed, 3 Jun 2015, 23:45):
>
> > I think that it may be a bit premature to invest heavily in the
> > parallel delta-policy windows just yet. We have not even answered all
> > questions on the key-local delta windows yet:
> >
> > - How does it behave with non-monotonic changes? What does the delta
> >   refer to: the max interval in the window, the interval to the
> >   earliest element, or the max difference between two consecutive
> >   elements?
> >
> > - What about the order of records? Are deltas even interesting when
> >   records come in arbitrary order? What about the predictability of
> >   recovery runs?
> >
> > I would assume that a consistent version of the key-local delta
> > windows will get us a long way, use-case wise.
> >
> > Let's learn more about how users use these policies in the "simple"
> > case, because that will impact the protocol for global coordination
> > (for example, concerning order, and relative to what element the
> > deltas are computed, the first or the min). Otherwise we invest a lot
> > of effort into something where we do not yet have a clear
> > understanding of how we actually want it to behave, exactly.
> >
> > What do you think?
> >
> > On Wed, Jun 3, 2015 at 2:14 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:
> >
> > > I am talking, of course, about global delta windows, on the full
> > > stream, not on a partition. Delta windows per partition happen, as
> > > you said, currently as well.
> > >
> > > On Wednesday, June 3, 2015, Aljoscha Krettek <aljos...@apache.org> wrote:
> > >
> > > > Yes, this is obvious, but if we simply partition the data on the
> > > > attribute that we use for the delta policy this can be done purely
> > > > on one machine. No need for complex communication/synchronization.
> > > >
> > > > On Wed, Jun 3, 2015 at 1:32 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:
> > > >
> > > > > Yes, we define a delta function from the first element to the
> > > > > last element in a window. Now let's discretize the stream using
> > > > > this semantics in parallel.
> > > > >
> > > > > Aljoscha Krettek <aljos...@apache.org> wrote (on Wed, 3 Jun 2015, 12:20):
> > > > >
> > > > > > Ah, ok. And by distributed you mean that the element that
> > > > > > starts the window can be processed on a different machine than
> > > > > > the element that finishes the window?
> > > > > >
> > > > > > On Wed, Jun 3, 2015 at 12:11 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:
> > > > > >
> > > > > > > This is not connected to the current implementation, so
> > > > > > > let's not talk about that.
> > > > > > >
> > > > > > > This is about the theoretical limits of supporting
> > > > > > > distributed delta policies, which have far-reaching
> > > > > > > implications for the windowing policies one can implement in
> > > > > > > a parallel way.
> > > > > > >
> > > > > > > But you are welcome to throw in any constructive ideas :)
> > > > > > >
> > > > > > > On Wed, Jun 3, 2015 at 11:49 AM, Aljoscha Krettek <aljos...@apache.org> wrote:
> > > > > > >
> > > > > > > > Part of the reason for my question is this:
> > > > > > > > https://issues.apache.org/jira/browse/FLINK-1967,
> > > > > > > > especially my latest comment there. If we want this, I
> > > > > > > > think we have to overhaul the windowing system anyway, and
> > > > > > > > then it doesn't make sense to explore complicated
> > > > > > > > workarounds for the current system.
> > > > > > > >
> > > > > > > > On Wed, Jun 3, 2015 at 11:07 AM, Gyula Fóra <gyula.f...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > There are simple ways of implementing it in a
> > > > > > > > > non-distributed or inconsistent fashion.
> > > > > > > > >
> > > > > > > > > On Wed, Jun 3, 2015 at 8:55 AM, Aljoscha Krettek <aljos...@apache.org> wrote:
> > > > > > > > >
> > > > > > > > > > This already sounds awfully complicated. Is there no
> > > > > > > > > > other way to implement the delta windows?
> > > > > > > > > >
> > > > > > > > > > On Wed, Jun 3, 2015 at 7:52 AM, Gyula Fóra <gyula.f...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Ufuk,
> > > > > > > > > > >
> > > > > > > > > > > In the concrete use case I have in mind I only want
> > > > > > > > > > > to send events to another subtask of the same task
> > > > > > > > > > > vertex.
> > > > > > > > > > >
> > > > > > > > > > > Specifically: if we want to do distributed
> > > > > > > > > > > delta-based windows, we need to send, after every
> > > > > > > > > > > trigger, the element that has triggered the current
> > > > > > > > > > > window. So practically I want to broadcast some
> > > > > > > > > > > event regularly to all subtasks of the same
> > > > > > > > > > > operator.
> > > > > > > > > > >
> > > > > > > > > > > In this case the operators would wait until they
> > > > > > > > > > > receive this event, so we need to make sure that
> > > > > > > > > > > this event sending is not blocked by the actual
> > > > > > > > > > > records.
> > > > > > > > > > >
> > > > > > > > > > > Gyula
> > > > > > > > > > >
> > > > > > > > > > > On Tuesday, June 2, 2015, Ufuk Celebi <u...@apache.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > On 02 Jun 2015, at 22:45, Gyula Fóra <gyf...@apache.org> wrote:
> > > > > > > > > > > > > I am wondering, what is the suggested way to
> > > > > > > > > > > > > send some events directly to another parallel
> > > > > > > > > > > > > instance in a Flink job? For example, from one
> > > > > > > > > > > > > mapper to another mapper (of the same operator).
> > > > > > > > > > > > >
> > > > > > > > > > > > > Do we have any internal support for this? The
> > > > > > > > > > > > > first thing that we thought of is iterations,
> > > > > > > > > > > > > but that is clearly overkill.
> > > > > > > > > > > >
> > > > > > > > > > > > There is no support for this at the moment. Any
> > > > > > > > > > > > parallel instance? Or a subtask instance of the
> > > > > > > > > > > > same task?
> > > > > > > > > > > >
> > > > > > > > > > > > Can you provide more input on the use case? It is
> > > > > > > > > > > > certainly possible to add support for this.
> > > > > > > > > > > >
> > > > > > > > > > > > If the events don't need to be inline with the
> > > > > > > > > > > > records, we can easily set up the
> > > > > > > > > > > > TaskEventDispatcher as a separate actor (or extend
> > > > > > > > > > > > the task manager) to process both
> > > > > > > > > > > > backwards-flowing events and, in general, any
> > > > > > > > > > > > events that don't need to be inline with the
> > > > > > > > > > > > records. The task deployment descriptors need to
> > > > > > > > > > > > be extended with the extra parallel instance
> > > > > > > > > > > > information.
> > > > > > > > > > > >
> > > > > > > > > > > > – Ufuk
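As a point of reference for the key-local ("simple") delta windows discussed above, here is a rough sketch using the DeltaTrigger API that ships with later Flink releases, not the 0.9-era policy API this thread was written against. The sensor-reading tuples, the 5.0 threshold, and the class name are made up for illustration; keying the stream keeps each delta computation on a single subtask, so no lateral communication is needed:

import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.windowing.delta.DeltaFunction;
import org.apache.flink.streaming.api.windowing.assigners.GlobalWindows;
import org.apache.flink.streaming.api.windowing.triggers.DeltaTrigger;

public class KeyLocalDeltaWindowSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // (sensorId, value) pairs; keying by sensorId means the delta for each
        // key is evaluated entirely on one subtask.
        DataStream<Tuple2<String, Double>> readings = env.fromElements(
                Tuple2.of("a", 1.0), Tuple2.of("a", 2.5), Tuple2.of("a", 7.0),
                Tuple2.of("b", 3.0), Tuple2.of("b", 9.5));

        readings
                .keyBy(new KeySelector<Tuple2<String, Double>, String>() {
                    @Override
                    public String getKey(Tuple2<String, Double> reading) {
                        return reading.f0;
                    }
                })
                .window(GlobalWindows.create())
                // Fire when the value drifts more than 5.0 from the last element
                // that fired (per key); a real job would also add an evictor to
                // bound the window state, as in Flink's TopSpeedWindowing example.
                .trigger(DeltaTrigger.of(
                        5.0,
                        new DeltaFunction<Tuple2<String, Double>>() {
                            @Override
                            public double getDelta(Tuple2<String, Double> oldPoint,
                                                   Tuple2<String, Double> newPoint) {
                                return Math.abs(newPoint.f1 - oldPoint.f1);
                            }
                        },
                        TypeInformation.of(new TypeHint<Tuple2<String, Double>>() {})
                                .createSerializer(env.getConfig())))
                .maxBy(1)
                .print();

        env.execute("key-local-delta-window-sketch");
    }
}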