Andra Lungu created FLINK-2661:
----------------------------------
Summary: Add a Node Splitting Technique to Overcome the
Limitations of Skewed Graphs
Key: FLINK-2661
URL: https://issues.apache.org/jira/browse/FLINK-2661
Project: Flink
Issue Type: Task
Components: Gelly
Affects Versions: 0.10
Reporter: Andra Lungu
Assignee: Andra Lungu
Skewed graphs pose particular challenges for computation models such as Gelly's
vertex-centric or GSA iterations. This is mainly because these approaches
process all vertices uniformly, regardless of their degree.
In a vertex-centric iteration, for instance, a skewed (high-degree) node takes
longer to process its neighbors than the other nodes in the graph. It acts as a
straggler, leaving the other nodes idle until it finishes its computation.
This issue can be mitigated by splitting a high-degree node into subnodes and
evenly distributing its edges among the resulting subvertices. The computation
is then performed on the subvertices rather than on the original node.
To this end, we should add a Splitting API on top of Gelly that helps to:
- determine the skewed nodes
- split them into subvertices
- merge the subvertices back at the end of the computation, given a
  user-defined combiner.
To illustrate the usage of these methods, we should add an example as well as a
separate entry in the documentation.
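
As a starting point for discussion, the three steps above (detect, split, merge)
could look roughly like the sketch below. It is plain Java and does not use
Gelly at all; the class NodeSplittingSketch, its methods findSkewed, split and
merge, the Edge type and the "#" separator are all hypothetical names chosen
for illustration, not a proposed API. It also simplifies by treating only the
out-degree as the skew measure and by redistributing only outgoing edges.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BinaryOperator;

public class NodeSplittingSketch {

    // Hypothetical separator between a vertex id and its subvertex index;
    // assumes original vertex ids never contain it.
    static final String SEP = "#";

    static final class Edge {
        final String src;
        final String dst;
        Edge(String src, String dst) { this.src = src; this.dst = dst; }
    }

    // Step 1: a vertex counts as skewed if its out-degree exceeds the threshold.
    static Set<String> findSkewed(List<Edge> edges, int threshold) {
        Map<String, Integer> degree = new HashMap<>();
        for (Edge e : edges) {
            degree.merge(e.src, 1, Integer::sum);
        }
        Set<String> skewed = new HashSet<>();
        for (Map.Entry<String, Integer> d : degree.entrySet()) {
            if (d.getValue() > threshold) {
                skewed.add(d.getKey());
            }
        }
        return skewed;
    }

    // Step 2: spread the out-edges of each skewed vertex round-robin over
    // `parts` subvertices named <id>#0, <id>#1, ...
    static List<Edge> split(List<Edge> edges, Set<String> skewed, int parts) {
        Map<String, Integer> seen = new HashMap<>();
        List<Edge> result = new ArrayList<>();
        for (Edge e : edges) {
            if (skewed.contains(e.src)) {
                int i = seen.merge(e.src, 1, Integer::sum) - 1;
                result.add(new Edge(e.src + SEP + (i % parts), e.dst));
            } else {
                result.add(e);
            }
        }
        return result;
    }

    // Step 3: fold the values computed on subvertices back into one value per
    // original vertex, using the user-defined combiner.
    static <T> Map<String, T> merge(Map<String, T> values, BinaryOperator<T> combiner) {
        Map<String, T> merged = new HashMap<>();
        for (Map.Entry<String, T> v : values.entrySet()) {
            int cut = v.getKey().indexOf(SEP);
            String original = cut < 0 ? v.getKey() : v.getKey().substring(0, cut);
            merged.merge(original, v.getValue(), combiner);
        }
        return merged;
    }
}
```

With a threshold of 3 and three subvertices, a node with six outgoing edges
would end up as three subvertices with two edges each. Per-subvertex results
can then be recombined with a combiner such as Integer::sum. A real
implementation would also have to decide how incoming edges and messages are
routed to the subvertices, which this sketch leaves open.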