Re: Flushing the result of a groupReduce to a Sink before all reduces complete

2016-10-28 Thread Paul Wilson
Hi Fabian, We have reworked our execution to remove the group reduce step and replace it with a map partition, and we're now seeing data pass through much sooner. Thanks for your quick reply; it was very useful. Regards, Paul On 26 October 2016 at 19:57, Fabian Hueske wrote: > Hi Paul, > > Fl

Re: Flushing the result of a groupReduce to a Sink before all reduces complete

2016-10-26 Thread Fabian Hueske
Hi Paul, Flink pushes the results of operators (including GroupReduce) to the next operator or sink as soon as they are computed. So what you are asking for is actually happening. However, before the GroupReduceFunction can be applied, the entire data set is sorted in order to group it. This step
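
To illustrate the point about sorting, here is a minimal plain-Java sketch. It does not use Flink's API or reproduce its internals; the class and method names are hypothetical. It only contrasts how many records must be read before the first result can leave a sort-based grouping step versus a per-partition batching step (the approach Paul later switched to).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Illustration only: hypothetical names, no Flink classes involved.
class BlockingVsPipelined {

    // Sort-based grouping: the first group is only known once all input
    // has been read and sorted. Returns the number of records consumed
    // before the first group could be emitted.
    static int readBeforeFirstGroup(Iterator<Integer> records) {
        List<Integer> buffer = new ArrayList<>();
        while (records.hasNext()) {
            buffer.add(records.next());
        }
        Collections.sort(buffer); // equal keys are adjacent only after this
        return buffer.size();
    }

    // Per-partition batching: a batch can be emitted as soon as batchSize
    // records have arrived. Returns the number of records consumed before
    // the first batch could be emitted.
    static int readBeforeFirstBatch(Iterator<Integer> records, int batchSize) {
        List<Integer> batch = new ArrayList<>();
        int read = 0;
        while (records.hasNext()) {
            batch.add(records.next());
            read++;
            if (batch.size() == batchSize) {
                return read; // the first full batch could go to the sink here
            }
        }
        return read; // input ended early: the remainder forms the only batch
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(3, 1, 2, 3, 1, 2);
        System.out.println(readBeforeFirstGroup(input.iterator()));    // prints 6
        System.out.println(readBeforeFirstBatch(input.iterator(), 2)); // prints 2
    }
}
```

With six input records, the sort-based variant has to consume all six before anything can be emitted, while the batching variant can hand over its first batch after two.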

Flushing the result of a groupReduce to a Sink before all reduces complete

2016-10-26 Thread Paul Wilson
Hi, (DataSet API, Flink 1.1.3) I have an application where I'd like to perform some mapping before batching the results and passing them to the sink. I'm performing a 'composite' key selection to group the items by their natural key as well as a batch (itemCount / batchSize). When I reduce the batch