Re: [gelly] Spargel model rework

Vasiliki Kalavri Tue, 27 Oct 2015 11:11:29 -0700

@Martin: thanks for your input! If you ran into any other issues that I
didn't mention, please let us know. Obviously, even with my proposal, there
are still features we cannot support, e.g. updating edge values and graph
mutations. We'll need to re-think the underlying iteration and/or graph
representation for those.


@Fabian: thanks a lot, no rush :)
Let me give you some more information that might make it easier to reason
about performance:

Currently, in Spargel the SolutionSet (SS) keeps the vertex state and the
workset (WS) keeps the active vertices. The iteration is composed of 2
coGroups. The first one takes the WS and the edges and produces messages.
The second one takes the messages and the SS and produced the new WS and
the SS-delta.

In my proposal, the SS has the vertex state and the WS has <vertexId,
MessageIterator> pairs, i.e. the inbox of each vertex. The plan is more
complicated because compute() needs to have two iterators: over the edges
and over the messages.
First, I join SS and WS to get the active vertices (have received a msg)
and their current state. Then I coGroup the result with the edges to access
the neighbors. Now the main problem is that this coGroup needs to have 2
outputs: the new messages and the new vertex value. I couldn't really find
a nice way to do this, so I'm emitting a Tuple that contains both types and
I have a flag to separate them later with 2 flatMaps. From the vertex
flatMap, I crete the SS-delta and from the messaged flatMap I apply a
reduce to group the messages by vertex and send them to the new WS. One
optimization would be to expose a combiner here to reduce message size.

tl;dr:
1. 2 coGroups vs. Join + coGroup + flatMap + reduce
2. how can we efficiently emit 2 different types of records from a coGroup?
3. does it make any difference if we group/combine the messages before
updating the workset or after?

Cheers,
-Vasia.


On 27 October 2015 at 18:39, Fabian Hueske <fhue...@gmail.com> wrote:

> I'll try to have a look at the proposal from a performance point of view in
> the next days.
> Please ping me, if I don't follow up this thread.
>
> Cheers, Fabian
>
> 2015-10-27 18:28 GMT+01:00 Martin Junghanns <m.jungha...@mailbox.org>:
>
> > Hi,
> >
> > At our group, we also moved several algorithms from Giraph to Gelly and
> > ran into some confusing issues (first in understanding, second during
> > implementation) caused by the conceptional differences you described.
> >
> > If there are no concrete advantages (performance mainly) in the Spargel
> > implementation, we would be very happy to see the Gelly API be aligned to
> > Pregel-like systems.
> >
> > Your SSSP example speaks for itself. Straightforward, if the reader is
> > familiar with Pregel/Giraph/...
> >
> > Best,
> > Martin
> >
> >
> > On 27.10.2015 17:40, Vasiliki Kalavri wrote:
> >
> >> Hello squirrels,
> >>
> >> I want to discuss with you a few concerns I have about our current
> >> vertex-centric model implementation, Spargel, now fully subsumed by
> Gelly.
> >>
> >> Spargel is our implementation of Pregel [1], but it violates some
> >> fundamental properties of the model, as described in the paper and as
> >> implemented in e.g. Giraph, GPS, Hama. I often find myself confused both
> >> when trying to explain it to current Giraph users and when porting my
> >> Giraph algorithms to it.
> >>
> >> More specifically:
> >> - in the Pregel model, messages produced in superstep n, are received in
> >> superstep n+1. In Spargel, they are produced and consumed in the same
> >> iteration.
> >> - in Pregel, vertices are active during a superstep, if they have
> received
> >> a message in the previous superstep. In Spargel, a vertex is active
> during
> >> a superstep if it has changed its value.
> >>
> >> These two differences require a lot of rethinking when porting
> >> applications
> >> and can easily cause bugs.
> >>
> >> The most important problem however is that we require the user to split
> >> the
> >> computation in 2 phases (2 UDFs):
> >> - messaging: has access to the vertex state and can produce messages
> >> - update: has access to incoming messages and can update the vertex
> value
> >>
> >> Pregel/Giraph only expose one UDF to the user:
> >> - compute: has access to both the vertex state and the incoming
> messages,
> >> can produce messages and update the vertex value.
> >>
> >> This might not seem like a big deal, but except from forcing the user to
> >> split their program logic into 2 phases, Spargel also makes some common
> >> computation patterns non-intuitive or impossible to write. A very simple
> >> example is propagating a message based on its value or sender ID. To do
> >> this with Spargel, one has to store all the incoming messages in the
> >> vertex
> >> value (might be of different type btw) during the messaging phase, so
> that
> >> they can be accessed during the update phase.
> >>
> >> So, my first question is, when implementing Spargel, were other
> >> alternatives considered and maybe rejected in favor of performance or
> >> because of some other reason? If someone knows, I would love to hear
> about
> >> them!
> >>
> >> Second, I wrote a prototype implementation [2] that only exposes one
> UDF,
> >> compute(), by keeping the vertex state in the solution set and the
> >> messages
> >> in the workset. This way all previously mentioned limitations go away
> and
> >> the API (see "SSSPComputeFunction" in the example [3]) looks a lot more
> >> like Giraph (see [4]).
> >>
> >> I have not run any experiments yet and the prototype has some ugly
> hacks,
> >> but if you think any of this makes sense, then I'd be willing to follow
> up
> >> and try to optimize it. If we see that it performs well, we can consider
> >> either replacing Spargel or adding it as an alternative.
> >>
> >> Thanks for reading this long e-mail and looking forward to your input!
> >>
> >> Cheers,
> >> -Vasia.
> >>
> >> [1]: https://kowshik.github.io/JPregel/pregel_paper.pdf
> >> [2]:
> >>
> >>
> https://github.com/vasia/flink/tree/spargel-2/flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/spargelnew
> >> [3]:
> >>
> >>
> https://github.com/vasia/flink/blob/spargel-2/flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/spargelnew/example/SSSPCompute.java
> >> [4]:
> >>
> >>
> https://github.com/grafos-ml/okapi/blob/master/src/main/java/ml/grafos/okapi/graphs/SingleSourceShortestPaths.java
> >>
> >>
>

Re: [gelly] Spargel model rework

Reply via email to