Re: Union of multiple datasets vs Join

2015-03-18 Thread Flavio Pompermaier
I don't know if that could be useful, do you? On Tue, Mar 17, 2015 at 10:29 PM, Stephan Ewen wrote: > Yes, that is the way to do it. > > This makes me think that it would be nice to have a method that builds the > union of a list of data sets. > > DataSet union(DataSet... sets) > > It would be i
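A minimal sketch of the helper Stephan describes, assuming it would simply fold the pairwise union over all inputs. To stay self-contained and runnable without Flink on the classpath, plain `java.util.List` stands in for `DataSet`, and list concatenation stands in for the pairwise union (duplicates are kept). The names `UnionSketch` and `unionAll` are hypothetical and not part of any Flink API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

final class UnionSketch {

    // Folds a pairwise union over all inputs, left to right.
    // Rejects an empty list because there is no neutral element to return.
    static <T> T unionAll(List<T> sets, BinaryOperator<T> union) {
        if (sets.isEmpty()) {
            throw new IllegalArgumentException("need at least one data set");
        }
        T result = sets.get(0);
        for (T next : sets.subList(1, sets.size())) {
            result = union.apply(result, next);
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the pairwise union: concatenation that keeps duplicates.
        BinaryOperator<List<Integer>> concat = (a, b) -> {
            List<Integer> out = new ArrayList<>(a);
            out.addAll(b);
            return out;
        };

        List<List<Integer>> inputs = Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.asList(2, 3),
                Arrays.asList(4));

        System.out.println(unionAll(inputs, concat));
    }
}
```

With real data sets, the same fold would presumably take the existing two-argument union as the operator (for example, a method reference to `DataSet.union`), which is an untested assumption here.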

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Robert Waury
Hi, I managed to reproduce the behavior and as far as I can tell it seems to be a problem with the memory allocation. I have filed a bug report in JIRA to get the attention of somebody who knows the runtime better than I do. https://issues.apache.org/jira/browse/FLINK-1734 Cheers, Robert On Tu

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Mihail Vieru
Hi, great! Thanks! I really need this bug fixed because I'm laying the groundwork for my Diplom thesis and I need to be sure that the Gelly API is reliable and can handle large datasets as intended. Cheers, Mihail On 18.03.2015 15:40, Robert Waury wrote: Hi, I managed to reproduce the beh

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Stephan Ewen
This job probably suffers from overly conservative memory assignment, giving the solution set too little memory. Can you try to make the solution set "unmanaged", excluding it from Flink's memory management? That may help with the problem. See here: https://github.com/apache/flink/blob/master/fli

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Vasiliki Kalavri
Hi Mihail, Robert, I've tried reproducing this, but I couldn't. I'm using the same twitter input graph from SNAP that you link to and also Scala IDE. The job finishes without a problem (both the SSSP example from Gelly and the unweighted version). The only thing I changed to run your version was

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Robert Waury
Hi Vasia, How much memory does your job use? I think the problem is, as Stephan says, an overly conservative allocation, but that it will work if you throw enough memory at it. Or did your setup succeed with an amount of memory comparable to Mihail's and mine? My main point is that it shouldn't take 1

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Vasiliki Kalavri
Hi Robert, my setup has even less memory than your setup, ~900MB in total. When using the local environment (running the job through your IDE), the available memory is split equally between the JobManager and TaskManager. Then, the default memory kept for network buffers is subtracted from the

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Mihail Vieru
Hi Vasia, I have used a simple job (attached) to generate a file which looks like this: 0 0 1 1 2 2 ... 456629 456629 456630 456630 I need the vertices to be generated from a file for my future work. Cheers, Mihail On 18.03.2015 17:04, Vasiliki Kalavri wrote: Hi Mihail, Robert, I've trie

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Vasiliki Kalavri
Hi Mihail, I used your code to generate the vertex file, then gave this and the edge list as input to your SSSP implementation and still couldn't reproduce the exception. I'm using the same local setup as I describe above. I'm not aware of any recent changes that might be relevant, but, just in ca

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Mihail Vieru
Hi Vasia, yes, I am using the latest master. I just did a pull again and the problem persists. Perhaps Robert could confirm as well. I've set the solution set to unmanaged in SSSPUnweighted as Stephan proposed and the job finishes. So I am able to proceed using this workaround. An odd thin

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Vasiliki Kalavri
hmm, I'm starting to run out of ideas... What's your source ID parameter? I ran mine with 0. About the result, you call both createVertexCentricIteration() and runVertexCentricIteration() on the initialized graph, right? On 18 March 2015 at 22:33, Mihail Vieru wrote: > Hi Vasia, > > yes, I am u

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Mihail Vieru
I'm also using 0 as sourceID. The exact program arguments: 0 /home/vieru/dev/flink-experiments/data/social_network.edgelist /home/vieru/dev/flink-experiments/data/social_network.verticeslist /home/vieru/dev/flink-experiments/sssp-output-higgstwitter 10 And yes, I call both methods on the init

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Vasiliki Kalavri
Well, one thing I notice is that your vertices and edges args are flipped. Might be the source of error :-) On 18 March 2015 at 23:04, Mihail Vieru wrote: > I'm also using 0 as sourceID. The exact program arguments: > > 0 /home/vieru/dev/flink-experiments/data/social_network.edgelist > /home/vi

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Vasiliki Kalavri
haha, yes, actually I just confirmed! If I flip my args, I get the error you mention in the first e-mail. You're trying to generate a graph giving the edge list as a vertex list, and this dataset is far too big for your memory settings (compare ~15m edges vs. the actual 400k vertices). I hope that clear ever

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Mihail Vieru
No way... that was it!? :))) Big thanks! :) The result is also correct now. Cheers, M. On 18.03.2015 22:49, Vasiliki Kalavri wrote: haha, yes, actually I just confirmed! If I flip my args, I get the error you mention in the first e-mail. you're trying to generate a graph giving the edge