Hi Gábor,
thank you very much for your explanation; that makes a lot of sense.
Best regards,
Urs
On 05.09.2017 14:32, Gábor Gévay wrote:
Hi Urs,
Yes, the 1/10th ratio is just a very loose rule of thumb. I would
suggest trying both the SORT and HASH strategies with a workload that
is as similar as possible to your production workload (similar data,
similar parallelism, etc.), and seeing which one is faster for your
specific use case.
Hi,
I would say that your assumption is correct and that the COMBINE strategy does
in fact also depend on the ratio "#total records / #records that fit into a
single Sorter/Hashtable".
I'm CC'ing Fabian, just to be sure. He knows that stuff better than I do.
Best,
Aljoscha
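
To make the ratio argument above a bit more concrete, here is a rough toy model in plain Java. It is not Flink code and does not reflect Flink's actual memory management; the class CombineSketch, its method names, and the notion of "capacity" are assumptions made purely for illustration. The sort-style combiner buffers a fixed number of records and emits one record per distinct key each time the buffer fills. The hash-style combiner combines in place and only flushes when the number of distinct keys exceeds what the table can hold.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Random;
import java.util.Set;

public class CombineSketch {

    // Toy sort-style combiner (assumption: the buffer holds `capacity` records).
    // Each time the buffer fills, one record per distinct key in it is emitted.
    public static long sortCombineOutput(int[] keys, int capacity) {
        long emitted = 0;
        int buffered = 0;
        Set<Integer> distinctInBuffer = new HashSet<>();
        for (int key : keys) {
            distinctInBuffer.add(key);
            buffered++;
            if (buffered == capacity) {
                emitted += distinctInBuffer.size();
                distinctInBuffer.clear();
                buffered = 0;
            }
        }
        // Flush whatever is left in the last, partially filled buffer.
        return emitted + distinctInBuffer.size();
    }

    // Toy hash-style combiner (assumption: the table holds `capacity` distinct keys).
    // Records with a known key are combined in place; the table is flushed
    // only when a new key arrives and there is no room for it.
    public static long hashCombineOutput(int[] keys, int capacity) {
        long emitted = 0;
        Map<Integer, Long> table = new HashMap<>();
        for (int key : keys) {
            if (!table.containsKey(key) && table.size() == capacity) {
                emitted += table.size();
                table.clear();
            }
            table.merge(key, 1L, Long::sum);
        }
        // Flush the remaining entries at the end of the input.
        return emitted + table.size();
    }

    // Uniformly random keys in [0, distinctKeys), seeded for repeatability.
    public static int[] randomKeys(int records, int distinctKeys, long seed) {
        Random rnd = new Random(seed);
        int[] keys = new int[records];
        for (int i = 0; i < records; i++) {
            keys[i] = rnd.nextInt(distinctKeys);
        }
        return keys;
    }

    public static void main(String[] args) {
        int[] keys = randomKeys(100_000, 100, 42L);
        int capacity = 1_000;
        System.out.println("sort-style output records: " + sortCombineOutput(keys, capacity));
        System.out.println("hash-style output records: " + hashCombineOutput(keys, capacity));
    }
}
```

In this model the hash-style combiner emits far fewer records whenever all distinct keys fit into the table, while the sort-style output grows with the number of times the buffer fills, i.e. with "#total records / #records that fit into the buffer". Real runtimes will differ, so measuring on a realistic workload remains the better guide.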
> On 31. Aug 2017,
Hi all,
I was wondering about the heuristics for CombineHint:
Flink uses SORT by default, but the doc for HASH says that we should
expect it to be faster if the number of keys is less than 1/10th of the
number of records.
HASH should be faster if it is able to combine a lot of records, which
hap