Re: [gelly] Spargel model rework

2016-01-06 Thread Vasiliki Kalavri
issue created: https://issues.apache.org/jira/browse/FLINK-3207 If anyone has any other suggestion about the renaming, let me know :) -V. On 5 January 2016 at 11:52, Aljoscha Krettek wrote: > Nice to hear. :D > > I think you can go ahead and add the Jira. About the renaming: I also > think tha

Re: [gelly] Spargel model rework

2016-01-06 Thread Stephan Ewen
+1 for the renaming! On Wed, Jan 6, 2016 at 8:01 PM, Vasiliki Kalavri wrote: > issue created: https://issues.apache.org/jira/browse/FLINK-3207 > > If anyone has any other suggestion about the renaming, let me know :) > > -V. > > On 5 January 2016 at 11:52, Aljoscha Krettek wrote: > > > Nice to

Re: [gelly] Spargel model rework

2016-01-05 Thread Aljoscha Krettek
Nice to hear. :D I think you can go ahead and add the Jira. About the renaming: I also think that it would make sense to do it. > On 04 Jan 2016, at 19:48, Vasiliki Kalavri wrote: > > Hello squirrels and happy new year! > > I'm reviving this thread to share some results and discuss next steps.

Re: [gelly] Spargel model rework

2016-01-04 Thread Vasiliki Kalavri
Hello squirrels and happy new year! I'm reviving this thread to share some results and discuss next steps. Using the Either type I was able to get rid of redundant messages and vertex state. During the past few weeks, I have been running experiments, which show that the performance of this "Prege

Re: [gelly] Spargel model rework

2015-11-11 Thread Stephan Ewen
See: https://issues.apache.org/jira/browse/FLINK-3002 On Wed, Nov 11, 2015 at 10:54 AM, Stephan Ewen wrote: > "Either" and "Optional" types are quite useful. > > Let's add them to the core Java API. > > On Wed, Nov 11, 2015 at 10:00 AM, Vasiliki Kalavri < > vasilikikala...@gmail.com> wrote: > >>

Re: [gelly] Spargel model rework

2015-11-11 Thread Stephan Ewen
"Either" and "Optional" types are quite useful. Let's add them to the core Java API. On Wed, Nov 11, 2015 at 10:00 AM, Vasiliki Kalavri < vasilikikala...@gmail.com> wrote: > Thanks Fabian! I'll try that :) > > On 10 November 2015 at 22:31, Fabian Hueske wrote: > > > You could implement a Java Ei

Re: [gelly] Spargel model rework

2015-11-11 Thread Vasiliki Kalavri
Thanks Fabian! I'll try that :) On 10 November 2015 at 22:31, Fabian Hueske wrote: > You could implement a Java Either type (similar to Scala's Either) that > either has a Message or the VertexState and a corresponding TypeInformation > and TypeSerializer that serializes a byte flag to indicate

Re: [gelly] Spargel model rework

2015-11-10 Thread Fabian Hueske
You could implement a Java Either type (similar to Scala's Either) that either has a Message or the VertexState and a corresponding TypeInformation and TypeSerializer that serializes a byte flag to indicate which of the two types is used. It might actually make sense to add a generic Either type to the
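
A minimal sketch of the byte-flag idea described above, in plain Java without any Flink dependency. Everything here is an assumption for illustration: the class name `EitherSketch`, the factory names `ofLeft`/`ofRight`, and the choice of `Double` for the message and `Long` for the vertex state are hypothetical, and this is not Flink's actual `TypeInformation` or `TypeSerializer` implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch only; names and types are assumptions, not Flink API.
final class EitherSketch {

    /** Holds exactly one value: a left (e.g. a message) or a right (e.g. vertex state). */
    static final class Either<L, R> {
        private final L left;
        private final R right;
        private final boolean isLeft;

        private Either(L left, R right, boolean isLeft) {
            this.left = left;
            this.right = right;
            this.isLeft = isLeft;
        }

        static <L, R> Either<L, R> ofLeft(L value) {
            return new Either<>(value, null, true);
        }

        static <L, R> Either<L, R> ofRight(R value) {
            return new Either<>(null, value, false);
        }

        boolean isLeft() {
            return isLeft;
        }

        L left() {
            if (!isLeft) throw new IllegalStateException("holds a right value");
            return left;
        }

        R right() {
            if (isLeft) throw new IllegalStateException("holds a left value");
            return right;
        }
    }

    /** Writes one flag byte (0 = message, 1 = vertex state) followed by the payload. */
    static byte[] serialize(Either<Double, Long> value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (value.isLeft()) {
                out.writeByte(0);
                out.writeDouble(value.left());
            } else {
                out.writeByte(1);
                out.writeLong(value.right());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the flag byte first, then decodes the payload as the matching type. */
    static Either<Double, Long> deserialize(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            byte flag = in.readByte();
            if (flag == 0) {
                return Either.ofLeft(in.readDouble());
            }
            return Either.ofRight(in.readLong());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this layout a record carries only the value it needs plus one flag byte, which is the property that would let messages and vertex state share a single data set without padding each record with an unused field.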

Re: [gelly] Spargel model rework

2015-11-10 Thread Vasiliki Kalavri
Hi, after running a few experiments, I can confirm that putting the combiner after the flatMap is indeed more efficient. I ran SSSP and Connected Components with Spargel, GSA, and the Pregel model and the results are the following: - for SSSP, Spargel is always the slowest, GSA is ~1.2x faster

Re: [gelly] Spargel model rework

2015-11-09 Thread Fabian Hueske
Hi Vasia, sorry for the late reply. I don't think there is a big difference. In both cases, the partitioning and sorting happens at the end of the iteration. If the groupReduce is applied before the workset is returned, the sorting happens on the filtered result (after the flatMap) which might be

Re: [gelly] Spargel model rework

2015-11-05 Thread Vasiliki Kalavri
@Fabian Is there any advantage in putting the reducer-combiner before updating the workset vs. after (i.e. right before the join with the solution set)? If it helps, here are the plans of these 2 alternatives: https://drive.google.com/file/d/0BzQJrI2eGlyYcFV2RFo5dUFNXzg/view?usp=sharing https://

Re: [gelly] Spargel model rework

2015-11-05 Thread Stephan Ewen
Sounds good. I like the idea of presenting it as a spectrum: Pregel -> Scatter/Gather (Spargel) -> GAS/GSA/SGA On Tue, Nov 3, 2015 at 5:55 PM, Martin Neumann wrote: > The problem with having many different graph models in gelly is that it > might get quite confusing for a user. > Maybe this can

Re: [gelly] Spargel model rework

2015-11-03 Thread Martin Neumann
The problem with having many different graph models in gelly is that it might get quite confusing for a user. Maybe this can be fixed with good documentation so that it's clear how each model works and what its benefits are (and maybe when it's better to use it over a different model). On Tue, Nov 3,

Re: [gelly] Spargel model rework

2015-11-03 Thread Andra Lungu
I also think a Giraph-like model could be added, but we shouldn't remove Spargel in favour of it! On Tue, Nov 3, 2015 at 2:35 AM, Stephan Ewen wrote: > When creating the original version of Spargel I was pretty much thinking in > GSA terms, more than in Pregel terms. There are some fundamental >

Re: [gelly] Spargel model rework

2015-11-03 Thread Vasiliki Kalavri
Thanks for the detailed explanation Stephan! Seeing Spargel as a Gather-Scatter model makes much more sense :) I think we should be more careful not to present it as a "Pregel equivalent" to avoid confusion of users coming from systems like Giraph. Maybe I could put together a comparison table Pre

Re: [gelly] Spargel model rework

2015-11-03 Thread Martin Neumann
I tried out Spargel during my work with Spotify and have implemented several algorithms using it. In all implementations I ended up storing additional Data and Flags on the Vertex to carry them over from one UDF to the next one. It definitely makes the code harder to write and maintain. I wonder h

Re: [gelly] Spargel model rework

2015-11-02 Thread Stephan Ewen
Actually GAS was not known when we did the iterations work (and Spargel), but the intuition that led to Spargel is similar to the intuition that led to GAS. On Mon, Nov 2, 2015 at 4:35 PM, Stephan Ewen wrote: > When creating the original version of Spargel I was pretty much thinking > in GSA t

Re: [gelly] Spargel model rework

2015-11-02 Thread Stephan Ewen
When creating the original version of Spargel I was pretty much thinking in GSA terms, more than in Pregel terms. There are some fundamental differences between Spargel and Pregel. Spargel is in between GAS and Pregel in some way, that is how I have always thought about it. The main reason for the

Re: [gelly] Spargel model rework

2015-10-30 Thread Fabian Hueske
We can of course inject an optional ReduceFunction (or GroupReduce, or combinable GroupReduce) to reduce the size of the work set. I suggested removing the GroupReduce function because it only collected all messages into a single record by emitting the input iterator, which is quite dangerous. A

Re: [gelly] Spargel model rework

2015-10-30 Thread Vasiliki Kalavri
Hi Fabian, thanks so much for looking into this so quickly :-) One update I have to make is that I tried running a few experiments with this on a 6-node cluster. The current implementation gets stuck at "Rebuilding Workset Properties" and never finishes a single iteration. Running the plan of one

Re: [gelly] Spargel model rework

2015-10-30 Thread Fabian Hueske
Hi Vasia, I had a look at your new implementation and have a few ideas for improvements. 1) Sending out the input iterator as you do in the last GroupReduce is quite dangerous and does not give a benefit compared to collecting all elements. Even though it is an iterator, it needs to be completely

Re: [gelly] Spargel model rework

2015-10-27 Thread Vasiliki Kalavri
@Martin: thanks for your input! If you ran into any other issues that I didn't mention, please let us know. Obviously, even with my proposal, there are still features we cannot support, e.g. updating edge values and graph mutations. We'll need to re-think the underlying iteration and/or graph repre

Re: [gelly] Spargel model rework

2015-10-27 Thread Fabian Hueske
I'll try to have a look at the proposal from a performance point of view in the next few days. Please ping me if I don't follow up on this thread. Cheers, Fabian 2015-10-27 18:28 GMT+01:00 Martin Junghanns : > Hi, > > At our group, we also moved several algorithms from Giraph to Gelly and > ran into s

Re: [gelly] Spargel model rework

2015-10-27 Thread Martin Junghanns
Hi, At our group, we also moved several algorithms from Giraph to Gelly and ran into some confusing issues (first in understanding, second during implementation) caused by the conceptual differences you described. If there are no concrete advantages (performance mainly) in the Spargel impl

[gelly] Spargel model rework

2015-10-27 Thread Vasiliki Kalavri
Hello squirrels, I want to discuss with you a few concerns I have about our current vertex-centric model implementation, Spargel, now fully subsumed by Gelly. Spargel is our implementation of Pregel [1], but it violates some fundamental properties of the model, as described in the paper and as im