Setting up the partitioning etc. is done automatically by the optimizer.
What is missing is a hash-based aggregation operator that the optimizer can
choose as an alternative strategy to sort-based aggregation.
A good first step would be to have a look at how the hash join works, in
order to get an
I added a comment to the JIRA issue with suggestions on how to proceed.
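
To make the difference between the two strategies concrete, here is a minimal
sketch in plain Java. It is not Flink code: the class and method names are made
up for illustration, it sums a long value per string key, and it keeps
everything on the heap, whereas a real operator would have to work within
managed memory and handle the case where the table does not fit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Flink.
class HashAggregationSketch {

    // Hash-based strategy: one pass over the input, one table entry per key.
    static Map<String, Long> hashAggregate(List<Map.Entry<String, Long>> records) {
        Map<String, Long> sums = new HashMap<>();
        for (Map.Entry<String, Long> r : records) {
            // Insert the value for a new key, or add it to the running sum.
            sums.merge(r.getKey(), r.getValue(), Long::sum);
        }
        return sums;
    }

    // Sort-based strategy: sort by key, then fold each run of equal keys.
    static Map<String, Long> sortAggregate(List<Map.Entry<String, Long>> records) {
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(records);
        sorted.sort(Map.Entry.comparingByKey());

        Map<String, Long> sums = new LinkedHashMap<>();
        String currentKey = null;
        long running = 0L;
        for (Map.Entry<String, Long> r : sorted) {
            if (currentKey != null && !currentKey.equals(r.getKey())) {
                // The key changed, so the previous group is complete.
                sums.put(currentKey, running);
                running = 0L;
            }
            currentKey = r.getKey();
            running += r.getValue();
        }
        if (currentKey != null) {
            sums.put(currentKey, running); // emit the last group
        }
        return sums;
    }
}
```

Both methods produce the same sums per key. The hash variant avoids the sort,
which is why it can be attractive when the number of distinct keys is small
enough for the table to stay in memory.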
2015-06-17 22:41 GMT+02:00 :
>
> Hello dear Developer,
> Currently aggregation functions are implemented based on sorting. We would
> like to add hash-based aggregation to Flink. We would be thankful if you
> could tell us how t