@Till:
A more meaningful example would be the following: from
{{1,{1,2}},{2,{2,3}},{3,{4,5}},{4,{1,27}}} the result should be
{1,2,3,27},{4,5} because sets #1, #2 and #4 are linked by shared elements
(#1 and #2 share 2, #1 and #4 share 1).
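
With elements treated as nodes, merging every group of sets that overlap
amounts to finding connected components. The following is a minimal sketch
of that idea in plain Scala collections using a union-find structure. It is
not a Flink program and says nothing about how to distribute the work. The
object name MergeOverlappingSets and the helper merge are hypothetical, and
the input follows the example above with set #4 taken to be {1,27}.

```scala
import scala.collection.mutable

// Sketch only: local, single-machine merge of overlapping sets.
// MergeOverlappingSets and merge are illustrative names, not a Flink API.
object MergeOverlappingSets {

  def merge(input: Seq[(Int, Set[Int])]): Set[Set[Int]] = {
    // parent(x) points towards the representative of x's group.
    val parent = mutable.Map.empty[Int, Int]

    // Find the representative of x, compressing the path on the way back.
    def find(x: Int): Int = {
      val p = parent.getOrElseUpdate(x, x)
      if (p == x) x
      else {
        val root = find(p)
        parent(x) = root
        root
      }
    }

    // Put a and b into the same group.
    def union(a: Int, b: Int): Unit = {
      val ra = find(a)
      val rb = find(b)
      if (ra != rb) parent(ra) = rb
    }

    // Link every element of a set to one element of that same set.
    // The set's key (the left tuple element) is not needed for the merge.
    for ((_, elems) <- input if elems.nonEmpty) {
      val first = elems.head
      elems.foreach(e => union(first, e))
    }

    // Collect the elements that ended up under the same representative.
    parent.keys.toSeq.groupBy(find).values.map(_.toSet).toSet
  }

  def main(args: Array[String]): Unit = {
    val input = Seq(
      1 -> Set(1, 2),
      2 -> Set(2, 3),
      3 -> Set(4, 5),
      4 -> Set(1, 27) // assumed reading of the fourth set in the example
    )
    println(merge(input))
  }
}
```

On the example input this yields the two groups {1,2,3,27} and {4,5}. On a
large DataSet the same grouping would have to be computed iteratively
rather than in one local pass.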
If you see this as a graph where the elements of the sets are nodes and a
set expresses a fu
Hi,
if I understood it correctly the "key" in that case would be a
fuzzy/probabilistic key. I'm not sure this can be computed using either the
sort-based or hash-based joining/grouping strategies of Flink. Maybe we
can find something if you elaborate.
Cheers,
Aljoscha
On Wed, 25 May 2016 at 10:2
Hi Simone,
could you elaborate a little bit on the actual operation you want to
perform? Given a data set {(1, {1,2}), (2, {2,3})}, what's the result of
your operation? Is the result { ({1,2}, {1,2,3}) } because the 2 is
contained in both sets?
Cheers,
Till
On Wed, May 25, 2016 at 10:22 AM, Simon
Hello,
I'm implementing MinHash for recommendation on Flink. I'm almost done but I
need an efficient way to merge sets of similar keys together (and later
join these sets of keys with more data).
The actual data structure is of the form DataSet[(Int,Set[Int])] where the
left element of the tuple