Let us assume that I want to perform some kind of aggregation in specified
time windows (e.g. tumbling window of 1 minute) and my aggregation operation
is associative. Wouldn't it be possible to represent windowAll in runtime as
/parallelism + 1/ operator instances where /parallelism/ number of operators
compute partial aggregates and then partial results are merged into one in
the last instance of the operator by using merge function that is present in
AggregateFunction function.

Basically I would like to compute single aggregated value for all events in
given time window and aggregation operation itself can be parallelized. 

For example i could have mapped stream with .map operation that has
parallelism 4, then each map operator instance would pass 1/4 of events to
adjacent instance of windowAll operator that would compute desired aggregate
over subset of events. When the window is closed all partial states would be
transferred to single windowAll merging operator.

Are there any plans to support such situations/is it possible to somehow
implement such operator in current version of Flink. 

Also there is a note in windowAll java-doc about possible parallelism but I
don't know how relevant it is to my case:

Note: This operation can be inherently non-parallel since all elements have
to pass through the same operator instance. (Only for special cases, such as
aligned time windows is it possible to perform this operation in parallel).



--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Associative-operation-windowAll-possible-parallelism-tp14187.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at 
Nabble.com.

Reply via email to