Hi Vasia, thanks for your reply. It helped a lot and I got some new ideas.

a) As you said, I did use the getPreviousIterationAggregate() method in
preSuperstep() of the next superstep. However, if the aggregates (the only
global results?) cannot be guaranteed to be consistent within the same
superstep, what should we do with the postSuperstep() method?

b) Though we can activate vertices by updating their values or by sending
them messages, IMO it may be more appropriate to let users decide when to
halt a vertex's iteration. Consider a complex algorithm that contains
different phases inside a vertex-centric iteration. Before moving to the next
phase (which should be synchronized), there may be some vertices that have
already finished their work in the current phase and are just waiting for the
others. Users may want these finished vertices to idle until the next phase
rather than halt. Can we consider adding a voteToHalt() method and some
internal variables to the Vertex/Edge class (or just creating an "advanced"
version of them) to make halting more controllable?

c) Sorry that I didn't make it clear before. By initialization I mean a
"global" one that executes once before the iteration. For example, users may
want to initialize the vertices' values from their adjacent edges before the
iteration starts. Maybe we can add an extra coGroupFunction to the
configuration parameters and apply it before the iteration?

What do you think?

(BTW, I opened a PR for FLINK-1526 (MST Lib & Example). Given the
complexity, the example is not included.)

I really appreciate all your help.

Best,
Xingcan

On Thu, Feb 9, 2017 at 5:36 PM, Vasiliki Kalavri <vasilikikala...@gmail.com> wrote:

> Hi Xingcan,
>
> On 7 February 2017 at 10:10, Xingcan Cui <xingc...@gmail.com> wrote:
>
>> Hi all,
>>
>> I got some question about the vertex-centric iteration in Gelly.
>>
>> a) It seems the postSuperstep method is called before the superstep
>> barrier (I got different aggregate values of the same superstep in this
>> method). Is this a bug? Or the design is just like that?
>>
>
> The postSuperstep() method is called inside the close() method of a
> RichCoGroupFunction that wraps the ComputeFunction. The close() method is
> called after the last call to the coGroup() after each iteration
> superstep.
> The aggregate values are not guaranteed to be consistent during the same
> superstep when they are computed. To retrieve an aggregate value for
> superstep i, you should use the getPreviousIterationAggregate() method in
> superstep i+1.
>
>>
>> b) There is not setHalt method for vertices. When no message received, a
>> vertex just quit the next iteration. Should I manually send messages (like
>> heartbeat) to keep the vertices active?
>>
>
> That's because vertex halting is implicitly controlled by the underlying
> delta iterations of Flink. A vertex will remain active as long as it
> receives a message or it updates its value, otherwise it will become
> inactive. The documentation on Gelly iterations [1] and DataSet iterations
> [2] might be helpful.
>
>>
>> c) I think we may need an initialization method in the ComputeFunction.
>>
>
> There exists a preSuperstep() method for initialization. This one will be
> executed once per superstep before the compute function is invoked for
> every vertex. Would this work for you?
>
>>
>> Any opinions? Thanks.
>>
>> Best,
>> Xingcan
>>
>
> I hope this helps,
> -Vasia.
>
> [1]: https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/libs/gelly/iterative_graph_processing.html#vertex-centric-iterations
> [2]: https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/batch/iterations.html