Hi,

the ReduceFunction holds the last emitted record as state. When a new record arrives, it combines the new record with the last emitted record, updates its state, and emits the new result. A ReduceFunction therefore emits one output record for each input record, i.e., it is triggered for each input record. The output of the ReduceFunction should be treated as a stream of updates, not of final results.
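To make the rolling-update behavior concrete, here is a minimal self-contained
sketch. Note that the single-key setup, the class name, and the input values
1, 2, 3, 4 are made up for illustration; only the reduce call itself
corresponds to the documentation example:

import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RollingReduceExample {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Four input records under a single key. No window is involved:
        // the rolling reduce fires on every record and immediately emits
        // the running sum, so the output stream is 1, 3, 6, 10.
        DataStream<Integer> updates = env
                .fromElements(1, 2, 3, 4)
                .keyBy(new KeySelector<Integer, Integer>() {
                    @Override
                    public Integer getKey(Integer value) {
                        return 0; // all records share one key, for illustration only
                    }
                })
                .reduce(new ReduceFunction<Integer>() {
                    @Override
                    public Integer reduce(Integer lastEmitted, Integer newRecord)
                            throws Exception {
                        // Combines the last emitted value (kept as keyed state)
                        // with the new record and emits the updated result.
                        return lastEmitted + newRecord;
                    }
                });

        updates.print();
        env.execute("Rolling reduce example");
    }
}

Each of the four printed values is an update of the sum so far, so a
downstream sink has to treat them as upserts rather than final results.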
Best,
Fabian

2018-01-03 18:46 GMT+01:00 M Singh <mans2si...@yahoo.com>:

> Hi Stefan:
>
> Thanks for your response.
>
> A follow-up question - In a streaming environment, we invoke the reduce
> operation and then output results to the sink. Does this mean reduce will
> be executed once on every trigger per partition, with all the items in
> each partition?
>
> Thanks
>
>
> On Wednesday, January 3, 2018 2:46 AM, Stefan Richter <
> s.rich...@data-artisans.com> wrote:
>
>
> Hi,
>
> I would interpret this as: the reduce produces an output for every new
> reduce call, emitting the updated value. There is no need for a window
> because it kicks in on every single invocation.
>
> Best,
> Stefan
>
>
> On 31.12.2017 at 22:28, M Singh <mans2si...@yahoo.com> wrote:
>
> Hi:
>
> The Apache Flink documentation (https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/index.html)
> describes the reduce function on a KeyedStream as follows:
>
> A "rolling" reduce on a keyed data stream. Combines the current element
> with the last reduced value and emits the new value.
>
> A reduce function that creates a stream of partial sums:
>
> keyedStream.reduce(new ReduceFunction<Integer>() {
>     @Override
>     public Integer reduce(Integer value1, Integer value2) throws Exception {
>         return value1 + value2;
>     }
> });
>
> The KeyedStream is not windowed, so when does the reduce function kick in
> to produce the DataStream (i.e., is there a default timeout or collection
> size that triggers it, since we have not defined any window on it)?
>
> Thanks
>
> Mans