I found the solution here:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Regarding-Broadcast-of-datasets-in-streaming-context-td6456.html
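
For anyone reading this later, here is a rough sketch of the data flow discussed in the quoted thread below: each task computes a local aggregate, a single step combines them into one global aggregate, and that value is handed to every downstream task. This is plain Python over in-memory lists, not Flink API code. The function names (local_aggregate, global_aggregate, broadcast, run) are made up for illustration, and using a sum plus a normalisation step is only an assumed example of "continue processing".

```python
from typing import Iterable, List, Sequence


def local_aggregate(partition: Iterable[float]) -> float:
    """Partial aggregate, computed independently by each parallel task."""
    return sum(partition)


def global_aggregate(partials: Iterable[float]) -> float:
    """Combine all partial results in one place (the parallelism-1 step)."""
    return sum(partials)


def broadcast(value: float, n_tasks: int) -> List[float]:
    """Give every downstream task its own copy of the global value."""
    return [value] * n_tasks


def run(partitions: Sequence[Sequence[float]]) -> List[List[float]]:
    # Step 1: one local aggregate per task/partition.
    partials = [local_aggregate(p) for p in partitions]
    # Step 2: a single combiner produces the global aggregate.
    total = global_aggregate(partials)
    # Step 3: the global aggregate is sent to all tasks.
    copies = broadcast(total, len(partitions))
    # Step 4 (assumed example): each task keeps processing its own data,
    # here by dividing every element by the global sum.
    return [[x / g for x in p] for p, g in zip(partitions, copies)]


if __name__ == "__main__":
    print(run([[2.0, 6.0], [4.0], [4.0]]))
```

In a real streaming job the aggregate changes as data arrives, so the single combining step would have to emit updated values over time; this batch-style sketch only shows the shape of the pipeline.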

2016-11-14 9:41 GMT+01:00 Ufuk Celebi <u...@apache.org>:

> I think this is independent of streaming. If you want to compute the
> aggregate over all keys and data you need to do this in a single task, e.g.
> use a (flat)map with parallelism 1, do the aggregation there and then
> broadcast to downstream operators. Does this make sense or am I overlooking
> something?
>
> On 12 November 2016 at 12:18:04, Felix Neutatz (neut...@googlemail.com)
> wrote:
> > want to calculate a local aggregation for each task and then
> > combine all these local aggregates to one global aggregate and
> > push this global aggregate to all nodes and continue processing
> > the data stream. If you don't understand my description, I also
> > made some drawings of what I mean:
> > https://docs.google.com/presentation/d/13ei6pzhwNKqNShhdNWXqJaYCG1z0Hsrxfy5sRnqun5M/edit?usp=sharing
> >
>
>
