Hi Mihail, Robert,

I've tried reproducing this, but I couldn't. I'm using the same Twitter input graph from SNAP that you linked to, and I'm also running the job from the Scala IDE. The job finishes without a problem (both the SSSP example from Gelly and the unweighted version).
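In case our inputs differ, this is roughly how I build the edges DataSet from the SNAP file. This is a quick sketch rather than my exact code; the path is a placeholder, and note that fieldDelimiter takes a char in the 0.8/0.9 CsvReader but a String in later versions:

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.graph.Edge;
    import org.apache.flink.types.NullValue;

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    // Read the space-separated "source target" pairs from the SNAP edge list
    // and wrap each pair into a Gelly edge with a NullValue edge value.
    DataSet<Edge<Long, NullValue>> edges = env
            .readCsvFile("/path/to/higgs-twitter-edgelist.txt") // placeholder path
            .fieldDelimiter(' ')
            .types(Long.class, Long.class)
            .map(new MapFunction<Tuple2<Long, Long>, Edge<Long, NullValue>>() {
                public Edge<Long, NullValue> map(Tuple2<Long, Long> pair) {
                    return new Edge<Long, NullValue>(pair.f0, pair.f1, NullValue.getInstance());
                }
            });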
The only thing I changed to run your version was creating the graph from the edge set only, i.e. like this:

    Graph<Long, Long, NullValue> graph = Graph.fromDataSet(edges,
            new MapFunction<Long, Long>() {
                public Long map(Long value) {
                    return Long.MAX_VALUE;
                }
            }, env);

Since the Twitter input is an edge list, how do you generate the vertex dataset in your case?

Thanks,
-Vasia.

On 18 March 2015 at 16:54, Mihail Vieru <vi...@informatik.hu-berlin.de> wrote:

> Hi,
>
> Great! Thanks!
>
> I really need this bug fixed because I'm laying the groundwork for my
> Diplom thesis, and I need to be sure that the Gelly API is reliable and
> can handle large datasets as intended.
>
> Cheers,
> Mihail
>
>
> On 18.03.2015 15:40, Robert Waury wrote:
>
> Hi,
>
> I managed to reproduce the behavior, and as far as I can tell it seems
> to be a problem with the memory allocation.
>
> I have filed a bug report in JIRA to get the attention of somebody who
> knows the runtime better than I do:
>
> https://issues.apache.org/jira/browse/FLINK-1734
>
> Cheers,
> Robert
>
> On Tue, Mar 17, 2015 at 3:52 PM, Mihail Vieru
> <vi...@informatik.hu-berlin.de> wrote:
>
>> Hi Robert,
>>
>> Thank you for your reply.
>>
>> I'm starting the job from the Scala IDE, so there is only one JobManager
>> and one TaskManager in the same JVM.
>> I've doubled the memory in the eclipse.ini settings, but I still get the
>> exception.
>>
>> -vmargs
>> -Xmx2048m
>> -Xms100m
>> -XX:MaxPermSize=512m
>>
>> Best,
>> Mihail
>>
>>
>> On 17.03.2015 10:11, Robert Waury wrote:
>>
>> Hi,
>>
>> Can you tell me how much memory your job has and how many workers you
>> are running?
>>
>> From the trace it seems the internal hash table allocated only 7 MB for
>> the graph data and therefore runs out of memory pretty quickly.
>>
>> Skewed data could also be an issue, but with a minimum of 5 pages and a
>> maximum of 8 it seems to be distributed fairly evenly across the
>> different partitions.
>>
>> Cheers,
>> Robert
>>
>> On Tue, Mar 17, 2015 at 1:25 AM, Mihail Vieru
>> <vi...@informatik.hu-berlin.de> wrote:
>>
>>> And the correct SSSPUnweighted attached.
>>>
>>>
>>> On 17.03.2015 01:23, Mihail Vieru wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm getting the following RuntimeException for an adaptation of the
>>>> SingleSourceShortestPaths example using the Gelly API (see attachment).
>>>> It's been adapted for unweighted graphs whose vertices have Long values.
>>>>
>>>> As an input graph I'm using the social network graph (~200MB unpacked)
>>>> from here: https://snap.stanford.edu/data/higgs-twitter.html
>>>>
>>>> For the small SSSPDataUnweighted graph (also attached) it terminates
>>>> and computes the distances correctly.
>>>>
>>>> 03/16/2015 17:18:23  IterationHead(WorksetIteration (Vertex-centric iteration
>>>> (org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$VertexDistanceUpdater@dca6fe4 |
>>>> org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$MinDistanceMessenger@6577e8ce)))(2/4)
>>>> switched to FAILED
>>>> java.lang.RuntimeException: Memory ran out. Compaction failed.
>>>> numPartitions: 32 minPartition: 5 maxPartition: 8
>>>> number of overflow segments: 176 bucketSize: 217
>>>> Overall memory: 20316160 Partition memory: 7208960
>>>> Message: Index: 8, Size: 7
>>>>     at org.apache.flink.runtime.operators.hash.CompactingHashTable.insert(CompactingHashTable.java:390)
>>>>     at org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTable(CompactingHashTable.java:337)
>>>>     at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:216)
>>>>     at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:278)
>>>>     at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
>>>>     at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:205)
>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>
>>>> Best,
>>>> Mihail
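P.S. Since raising -Xmx alone didn't help: as far as I know, the runtime by default only takes a fraction of the free heap as managed memory, so it can be easier to request a fixed amount explicitly. Something along these lines might be a workaround while FLINK-1734 is open. This is an untested sketch, and it assumes a Flink version where ExecutionEnvironment.createLocalEnvironment accepts a Configuration; the key name may also differ between versions:

    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.configuration.Configuration;

    // Untested sketch: request a fixed amount of managed memory for the local
    // mini-cluster instead of the default fraction of the heap.
    Configuration conf = new Configuration();
    conf.setInteger("taskmanager.memory.size", 512); // managed memory in MB (key name assumed)
    ExecutionEnvironment env = ExecutionEnvironment.createLocalEnvironment(conf);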
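P.P.S. Since the attachments don't show up in the archive: the unweighted adaptation we're both running looks roughly like the classes below. This is sketched from memory against the Gelly vertex-centric (spargel) API, not Mihail's exact attachment, and the method signatures have changed between Gelly versions:

    import org.apache.flink.graph.Vertex;
    import org.apache.flink.graph.spargel.MessageIterator;
    import org.apache.flink.graph.spargel.MessagingFunction;
    import org.apache.flink.graph.spargel.VertexUpdateFunction;
    import org.apache.flink.types.NullValue;

    // Keeps the smaller of the current distance and the minimum incoming message.
    public static final class VertexDistanceUpdater
            extends VertexUpdateFunction<Long, Long, Long> {
        @Override
        public void updateVertex(Vertex<Long, Long> vertex, MessageIterator<Long> inMessages) {
            long minDistance = Long.MAX_VALUE;
            for (long msg : inMessages) {
                minDistance = Math.min(minDistance, msg);
            }
            if (vertex.getValue() > minDistance) {
                setNewVertexValue(minDistance);
            }
        }
    }

    // Propagates distance + 1 to all neighbors, since every edge has weight 1.
    public static final class MinDistanceMessenger
            extends MessagingFunction<Long, Long, Long, NullValue> {
        @Override
        public void sendMessages(Vertex<Long, Long> vertex) {
            if (vertex.getValue() < Long.MAX_VALUE) {
                sendMessageToAllNeighbors(vertex.getValue() + 1);
            }
        }
    }

    // Usage: graph.runVertexCentricIteration(new VertexDistanceUpdater(),
    //                                        new MinDistanceMessenger(), maxIterations);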