Hey everyone,

While coding for my thesis, I noticed that roughly half of the current Gelly
functions (the ones that use join operators) fail in a cluster environment
with the following exception:

java.lang.IllegalArgumentException: Too few memory segments provided. Hash Join needs at least 33 memory segments.

As far as I can tell, this is related to the fact that, in the vast majority
of cases, the vertex data set is significantly smaller than the edge data set.
To get rid of the error, I did the following:

DataSet<Tuple2<Edge<K, EV>, Vertex<K, VV>>> edgesWithSources = edges
      .join(this.vertices, JoinOperatorBase.JoinHint.BROADCAST_HASH_SECOND)
      .where(0)
      .equalTo(0);

In short, I added join hints. I believe this should also be in Gelly, in
case someone bumps into the same problem somewhere in the future.
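
For anyone not familiar with the hint: as I understand it,
BROADCAST_HASH_SECOND tells the optimizer that the second input (here, the
vertices) is small, so it gets broadcast and hashed while the first input
(the edges) is streamed and probed against that hash table. Below is a rough
plain-Java sketch of that idea only. It is not Flink code and not how Flink
implements the join internally; the class name, method name, and the use of
long[] pairs for edges are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a single-process stand-in for "hash the small side,
// stream the large side". All names here are hypothetical.
public class BroadcastHashJoinSketch {

    // Joins each edge {sourceId, targetId} with the value of its source vertex.
    public static List<String> joinEdgesWithSources(List<long[]> edges,
                                                    Map<Long, String> vertices) {
        // Build side: the (small) vertex set is loaded into a hash table.
        Map<Long, String> buildSide = new HashMap<>(vertices);

        // Probe side: the (large) edge set is streamed once against the table.
        List<String> joined = new ArrayList<>();
        for (long[] edge : edges) {
            String sourceValue = buildSide.get(edge[0]);
            if (sourceValue != null) { // inner join: skip edges without a matching vertex
                joined.add(edge[0] + "->" + edge[1] + " (source=" + sourceValue + ")");
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        Map<Long, String> vertices = new HashMap<>();
        vertices.put(1L, "a");
        vertices.put(2L, "b");

        List<long[]> edges = new ArrayList<>();
        edges.add(new long[] {1L, 2L});
        edges.add(new long[] {2L, 1L});
        edges.add(new long[] {3L, 1L}); // source vertex 3 does not exist, so this edge is dropped

        for (String row : joinEdgesWithSources(edges, vertices)) {
            System.out.println(row);
        }
    }
}
```

The point is just that only the smaller side has to fit in the hash table,
which is why hinting the vertex side as the build side looks like a
reasonable default for Gelly.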

What do you think?
