[ https://issues.apache.org/jira/browse/FLINK-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733284#comment-15733284 ]
Vasia Kalavri commented on FLINK-1536: -------------------------------------- The idea is what [~greghogan] describes. In a distributed graph processing system, you first have to partition the graph before you perform any computation. The performance of graph algorithms greatly depends on the resulting partitioning. A bad partitioning might assign disproportionally more vertices to one partition thus hurting load balancing or it might partition the graph so that the communication required is too high (or both). Currently, we only support hash partitioning; that is, vertices are randomly assigned to workers using the hash of their id. This strategy has very low overhead and results in good load balancing unless the graphs are skewed. For more details on this problem, I suggest you read some of the papers in the literature linked in the description of the issue [~ivan.mushketyk]. > Graph partitioning operators for Gelly > -------------------------------------- > > Key: FLINK-1536 > URL: https://issues.apache.org/jira/browse/FLINK-1536 > Project: Flink > Issue Type: New Feature > Components: Gelly > Reporter: Vasia Kalavri > Assignee: Ivan Mushketyk > Priority: Minor > > Smart graph partitioning can significantly improve the performance and > scalability of graph analysis applications. Depending on the computation > pattern, a graph partitioning algorithm divides the graph into (maybe > overlapping) subgraphs, optimizing some objective. For example, if > communication is performed across graph edges, one might want to minimize the > edges that cross from one partition to another. > The problem of graph partitioning is a well studied problem and several > algorithms have been proposed in the literature. The goal of this project > would be to choose a few existing partitioning techniques and implement the > corresponding graph partitioning operators for Gelly. > Some related literature can be found [here| > http://www.citeulike.org/user/vasiakalavri/tag/graph-partitioning]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)