[ https://issues.apache.org/jira/browse/FLINK-5127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Flink Jira Bot updated FLINK-5127:
----------------------------------
    Labels: auto-deprioritized-major auto-unassigned stale-minor  (was: auto-deprioritized-major auto-unassigned)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue has been marked as Minor but is unassigned, and neither it nor its Sub-Tasks have been updated for 180 days. I have gone ahead and marked it "stale-minor". If this ticket is still Minor, please either assign yourself or give an update. Afterwards, please remove the label or in 7 days the issue will be deprioritized.

> Reduce the amount of intermediate data in vertex-centric iterations
> -------------------------------------------------------------------
>
>                 Key: FLINK-5127
>                 URL: https://issues.apache.org/jira/browse/FLINK-5127
>             Project: Flink
>          Issue Type: Improvement
>          Components: Library / Graph Processing (Gelly)
>    Affects Versions: 1.1.0, 1.2.0
>            Reporter: Vasia Kalavri
>            Priority: Minor
>              Labels: auto-deprioritized-major, auto-unassigned, stale-minor
>
> The vertex-centric plan contains a join between the workset (messages) and
> the solution set (vertices) that outputs <Vertex, Message> tuples. This
> intermediate dataset is then co-grouped with the edges to provide the Pregel
> interface directly.
> This issue proposes an improvement to reduce the size of this intermediate
> dataset. In particular, the vertex state does not have to be attached to all
> the output tuples of the join. If we replace the join with a coGroup and use
> an `Either` type, we can attach the vertex state to the first tuple only. The
> subsequent coGroup can retrieve the vertex state from the first tuple and
> correctly expose the Pregel interface.
> In my preliminary experiments, I find that this change reduces intermediate
> data by 2x for small vertex states and 4-5x for large vertex states.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
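The proposal quoted above can be illustrated with a small model. This is only a sketch in plain Python, not Flink or Gelly code: the function names `join_plan`, `cogroup_plan` and `read_groups`, the "left"/"right" tags standing in for the `Either` type, and the sample data are all assumptions made for illustration. It also assumes that only vertices that received at least one message produce output.

```python
from collections import defaultdict


def join_plan(vertices, messages):
    """Model of the join: every output tuple carries the vertex state."""
    return [(vid, vertices[vid], msg) for vid, msg in messages if vid in vertices]


def cogroup_plan(vertices, messages):
    """Model of the coGroup: the vertex state is emitted once per vertex,
    tagged "left", followed by that vertex's messages tagged "right"."""
    inbox = defaultdict(list)
    for vid, msg in messages:
        inbox[vid].append(msg)
    out = []
    for vid, msgs in inbox.items():
        if vid not in vertices:
            continue
        out.append((vid, "left", vertices[vid]))  # state on the first tuple only
        out.extend((vid, "right", m) for m in msgs)
    return out


def read_groups(tagged):
    """Model of the subsequent coGroup: recover (state, messages) per vertex
    from the tagged stream, relying on the "left" tuple arriving first."""
    groups = {}
    for vid, tag, payload in tagged:
        if tag == "left":
            groups[vid] = (payload, [])
        else:
            groups[vid][1].append(payload)
    return groups


# Hypothetical sample data: two vertices, four messages.
vertices = {1: "S1", 2: "S2"}
messages = [(1, "a"), (1, "b"), (1, "c"), (2, "d")]

joined = join_plan(vertices, messages)
tagged = cogroup_plan(vertices, messages)

state_copies_join = len(joined)                                          # 4
state_copies_cogroup = sum(1 for _, tag, _ in tagged if tag == "left")   # 2
```

In this model the number of state copies drops from one per message to one per vertex, so the saving grows with the number of messages per vertex and with the size of the vertex state. That matches the direction of the reporter's observation that large vertex states benefit more, but the model says nothing about the actual ratios measured in Flink.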