Hi squirrels,

here's some more Gelly iteration love for you ;)

Together with Kien and Nikola (students at KTH, cc-ed), we looked into
partition-centric iterations for Gelly. The idea of the partition-centric
model is to expose the graph partition structure to the user (not just a
single vertex, but all vertices belonging to the same partition at once).
It allows for local graph structure exploitation inside a partition and
potentially avoids unnecessary communication. The model was first
introduced in [1] and we used it as inspiration, but our current
implementation is closer to the Spargel model.

Kien and Nikola prepared some slides with their design choices and
experiments [2]. The connected components results are quite impressive :)
There is still room for optimization, but you can find the wip repository
in [3] if you're interested.

What do you think about exposing this as a Gelly iteration model? The task
has been in the Gelly roadmap and there are certainly performance
advantages for some algorithms. This implementation can be basically
considered as a Spargel optimization (existing VertexUpdateFunctions from
Spargel can actually be re-used). On the other hand, the model is quite
low-level compared to Spargel and vertex-centric, since users have to deal
with inside-partition algorithms themselves (e.g. compare the connected
components implementation [4] with the existing Spargel/GSA ones).

Thanks!
-Vasia.

[1]: http://researcher.ibm.com/researcher/files/us-ytian/giraph++.pdf
[2]:
https://drive.google.com/file/d/0BzQJrI2eGlyYRmlKTjMyVEk1T0U/view?usp=sharing
[3]: https://github.com/vasia/gelly-partition-centric
[4]:
https://github.com/vasia/gelly-partition-centric/blob/master/src/main/java/org/apache/flink/graph/partition/centric/library/PCConnectedComponents.java

Reply via email to