I'd like to start a discussion of FLIP-491: BundledAggregateFunction for batched aggregation [1].
This FLIP proposes a new interface, BundledAggregateFunction, that AggregateFunction UDFs can implement. It adds a batched method call, so that an implementation can handle many rows at a time across multiple keys instead of receiving per-row calls such as accumulate and retract.

The goal is to achieve high throughput while still allowing calls to external systems or other blocking operations. Making such calls through the conventional AggregateFunction methods would be prohibitively slow, but when given a batch of inputs and the accumulator for each key, the implementer can parallelize or internally batch lookups to improve performance.

Looking forward to your feedback and suggestions.

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-491%3A+BundledAggregateFunction+for+batched+aggregation

Thanks,
Alan
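
To illustrate the batching idea described above, here is a rough, self-contained Java sketch. It is not the interface from the FLIP: the names BundledSumSketch, KeyBundle, FakeWeightService, lookupWeights and bundledAccumulateRetract are all hypothetical, and the actual signatures in FLIP-491 may differ. It only shows why receiving the rows and accumulators for several keys in one call lets an implementation replace many blocking lookups with a single batched one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BundledSumSketch {

    /** Hypothetical per-key bundle: the current accumulator plus the buffered rows. */
    public static final class KeyBundle {
        public final long accumulator;
        public final List<Long> accumulateRows; // row ids to add
        public final List<Long> retractRows;    // row ids to remove

        public KeyBundle(long accumulator, List<Long> accumulateRows, List<Long> retractRows) {
            this.accumulator = accumulator;
            this.accumulateRows = accumulateRows;
            this.retractRows = retractRows;
        }
    }

    /** Stand-in for an external system (assumption); it counts round trips. */
    public static final class FakeWeightService {
        public int roundTrips = 0;

        public Map<Long, Long> lookupWeights(List<Long> ids) {
            roundTrips++; // one blocking call, no matter how many ids are passed
            Map<Long, Long> weights = new HashMap<>();
            for (Long id : ids) {
                weights.put(id, id * 10); // arbitrary fake weight for the demo
            }
            return weights;
        }
    }

    /** Hypothetical batched call: all keys, rows and accumulators arrive together. */
    public static Map<String, Long> bundledAccumulateRetract(
            Map<String, KeyBundle> bundles, FakeWeightService service) {
        // Gather every row id across all keys so the lookup happens once.
        List<Long> ids = new ArrayList<>();
        for (KeyBundle bundle : bundles.values()) {
            ids.addAll(bundle.accumulateRows);
            ids.addAll(bundle.retractRows);
        }
        Map<Long, Long> weights = service.lookupWeights(ids);

        // Apply accumulate and retract per key using the already-fetched weights.
        Map<String, Long> updated = new HashMap<>();
        for (Map.Entry<String, KeyBundle> entry : bundles.entrySet()) {
            KeyBundle bundle = entry.getValue();
            long acc = bundle.accumulator;
            for (Long id : bundle.accumulateRows) {
                acc += weights.get(id);
            }
            for (Long id : bundle.retractRows) {
                acc -= weights.get(id);
            }
            updated.put(entry.getKey(), acc);
        }
        return updated;
    }
}
```

In this sketch, a bundle with two keys and four buffered rows costs one round trip to the fake service, where a per-row accumulate/retract implementation making the same lookup would need four.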