This is an interesting issue, because, quite frankly, the join hint you
passed simply reversed the sides of the join. The algorithm is still the
same and has the same minimum memory requirements.
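To make that point concrete, here is a minimal sketch against the Flink DataSet API (the class name and the toy vertex/edge data sets are made up for illustration): REPARTITION_HASH_FIRST and REPARTITION_HASH_SECOND both compile to the same hash join and only swap which input builds the hash table.

import org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;

public class JoinHintSidesSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Hypothetical (id, value) vertices and (src, trg, weight) edges.
        DataSet<Tuple2<Long, Double>> vertices = env.fromElements(
                new Tuple2<>(1L, 1.0), new Tuple2<>(2L, 2.0));
        DataSet<Tuple3<Long, Long, Double>> edges = env.fromElements(
                new Tuple3<>(1L, 2L, 0.5), new Tuple3<>(2L, 1L, 0.5));

        // Build the hash table on the first (vertex) input ...
        vertices.join(edges, JoinHint.REPARTITION_HASH_FIRST)
                .where(0).equalTo(0)
                .print();

        // ... or on the second (edge) input. Either way it is the same hash join
        // algorithm with the same minimum memory requirement; only the build and
        // probe sides are swapped.
        vertices.join(edges, JoinHint.REPARTITION_HASH_SECOND)
                .where(0).equalTo(0)
                .print();
    }
}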
The fact that it made a difference is quite curious. The only thing I can
imagine is that this hint ch
Your arguments are perfectly valid. So, what I suggest is to keep the
functions as they are now, e.g. groupReduceOnNeighbors,
and to add a groupReduceOnNeighbors(blablaSameArguments, boolean
useJoinHints) overload. That way, the user can decide whether they'd like to
trade speed for a program that actually runs.
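Just to illustrate the shape of that idea, a rough standalone sketch (not the actual Gelly signature; the helper name, the toy data sets and the choice of REPARTITION_SORT_MERGE as the hinted strategy are placeholders, since which hint Gelly should actually use is exactly what is being discussed here):

import org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;

public class JoinHintFlagSketch {

    // Hypothetical helper: with the flag off, the optimizer chooses the join
    // strategy (current behaviour); with the flag on, an explicit hint is passed
    // (REPARTITION_SORT_MERGE here is only a stand-in for whatever Gelly would pick).
    static DataSet<Tuple2<Tuple2<Long, Double>, Tuple3<Long, Long, Double>>> joinVerticesWithEdges(
            DataSet<Tuple2<Long, Double>> vertices,
            DataSet<Tuple3<Long, Long, Double>> edges,
            boolean useJoinHints) {
        JoinHint hint = useJoinHints
                ? JoinHint.REPARTITION_SORT_MERGE
                : JoinHint.OPTIMIZER_CHOOSES;
        return vertices.join(edges, hint).where(0).equalTo(0);
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Tuple2<Long, Double>> vertices = env.fromElements(
                new Tuple2<>(1L, 1.0), new Tuple2<>(2L, 2.0));
        DataSet<Tuple3<Long, Long, Double>> edges = env.fromElements(
                new Tuple3<>(1L, 2L, 0.5));
        joinVerticesWithEdges(vertices, edges, true).print();
    }
}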
Hey,
I agree with Martin on this. It's the optimizer's job to decide the join
strategy.
Maybe the join hint worked in 99% of your cases, but we can't simply
generalize this to all datasets and algorithms and hard-code a join hint
that assumes the vertex set is always much smaller than the edge set.
Hi,
I guess enforcing a join strategy by default is not the best option,
since you can't assume what the user did before actually calling the
Gelly functions or what the data looks like (maybe it's one of the 1% of
graphs where the relation is the other way around, or the vertex data set
is very large).
Hey everyone,
When coding for my thesis, I observed that half of the current Gelly
functions (the ones that use join operators) fail in a cluster environment
with the following exception:
java.lang.IllegalArgumentException: Too few memory segments provided. Hash Join
needs at least 33 memory segments.
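For context, a simplified sketch (not the actual Gelly code; the class name and toy data sets are made up) of the shape of such a function: a vertices/edges join followed by a per-vertex aggregation. Without a hint, the optimizer is free to compile the join as a hash join, and the hash join is the operator that raises the exception above when it is handed fewer than its minimum of 33 memory segments.

import org.apache.flink.api.common.functions.JoinFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;

public class NeighborhoodJoinSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Hypothetical (id, value) vertices and (src, trg, weight) edges.
        DataSet<Tuple2<Long, Double>> vertices = env.fromElements(
                new Tuple2<>(1L, 1.0), new Tuple2<>(2L, 2.0), new Tuple2<>(3L, 3.0));
        DataSet<Tuple3<Long, Long, Double>> edges = env.fromElements(
                new Tuple3<>(1L, 2L, 0.5), new Tuple3<>(1L, 3L, 0.5));

        // Join each edge with its target vertex to get (source id, neighbor value).
        // No hint is given, so the optimizer may choose a hash join here.
        DataSet<Tuple2<Long, Double>> neighborValues = edges
                .join(vertices)
                .where(1).equalTo(0)
                .with(new JoinFunction<Tuple3<Long, Long, Double>, Tuple2<Long, Double>,
                        Tuple2<Long, Double>>() {
                    @Override
                    public Tuple2<Long, Double> join(Tuple3<Long, Long, Double> edge,
                                                     Tuple2<Long, Double> neighbor) {
                        return new Tuple2<>(edge.f0, neighbor.f1);
                    }
                });

        // Sum the neighbor values per source vertex.
        neighborValues.groupBy(0).sum(1).print();
    }
}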