I would go for the first solution with the join.
This gives the engine the most freedom in choosing an execution strategy:
- shipping strategy: repartition vs. broadcast-forward
- local strategy: sort-merge vs. hash-join
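
To illustrate the second choice, here is a minimal plain-Java sketch of the two local strategies applied to an intersection of vertex IDs. It does not use the Flink API; the class and method names are made up for illustration, and it assumes neither input contains duplicates. In an actual join the engine would pick between these itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Flink or Gelly.
final class IntersectStrategies {

    // Hash strategy: build a hash set from one side, probe it with the other.
    // The result follows the order of the probe side (right).
    static List<Long> hashIntersect(List<Long> left, List<Long> right) {
        Set<Long> build = new HashSet<>(left);
        List<Long> out = new ArrayList<>();
        for (Long id : right) {
            if (build.contains(id)) {
                out.add(id);
            }
        }
        return out;
    }

    // Sort-merge strategy: sort both sides, then advance two cursors in lockstep.
    // The result is in ascending order.
    static List<Long> sortMergeIntersect(List<Long> left, List<Long> right) {
        List<Long> l = new ArrayList<>(left);
        List<Long> r = new ArrayList<>(right);
        Collections.sort(l);
        Collections.sort(r);
        List<Long> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < l.size() && j < r.size()) {
            int cmp = l.get(i).compareTo(r.get(j));
            if (cmp < 0) {
                i++;            // left element has no partner yet, move on
            } else if (cmp > 0) {
                j++;            // right element has no partner yet, move on
            } else {
                out.add(l.get(i));
                i++;
                j++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Long> left = List.of(3L, 1L, 2L);
        List<Long> right = List.of(2L, 3L, 4L);
        System.out.println(hashIntersect(left, right));
        System.out.println(sortMergeIntersect(left, right));
    }
}
```

The hash variant needs one side to fit in memory as the build side, while the sort-merge variant pays for two sorts up front; with unknown input sizes, that trade-off is what the optimizer gets to decide when the intersection is written as a join.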
Best, Fabian
2015-10-28 18:45 GMT+01:00 Vasiliki Kalavri:
Hi Martin,
isn't finding the intersection of edges enough in this case?
And assuming there are no duplicate edges, I believe a join should do the
trick.
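
A small plain-Java sketch of why the no-duplicates assumption matters: an equi-join on whole elements returns exactly the intersection when both sides are duplicate-free, but emits one result per matching pair otherwise. The class name and the string encoding of edges are made up for illustration; this is not Gelly code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink or Gelly.
final class JoinDuplicates {

    // Naive equi-join on element equality: one output per matching (left, right) pair.
    static List<String> joinOnEquality(List<String> left, List<String> right) {
        List<String> out = new ArrayList<>();
        for (String l : left) {
            for (String r : right) {
                if (l.equals(r)) {
                    out.add(l);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Without duplicates the join result is the intersection.
        System.out.println(joinOnEquality(List.of("1->2", "2->3"), List.of("2->3", "3->4")));
        // With duplicates each matching pair is emitted, so the result grows.
        System.out.println(joinOnEquality(List.of("2->3", "2->3"), List.of("2->3", "2->3")).size());
    }
}
```

If duplicates can occur, a distinct step before or after the join would restore set semantics.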
Cheers,
-Vasia.
On 28 October 2015 at 13:15, Martin Junghanns wrote:
Hi all!
While working on FLINK-2905, I was wondering what a good (and fast) way
to compute the intersection of two data sets (Gelly vertices in my
case) of unknown size would be.
I came up with three ways to solve this:
Consider two sets:
DataSet<Vertex<K, VV>> verticesLeft = this.getVertices();
Dat