Well, one thing I notice is that your vertices and edges args are flipped. Might be the source of error :-)
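For reference, a minimal sketch of the argument order in question, assuming Gelly's Graph.fromDataSet(vertices, edges, env) factory method; the class name and placeholder data below are illustrative, not the code from the thread:

    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.graph.Edge;
    import org.apache.flink.graph.Graph;
    import org.apache.flink.graph.Vertex;
    import org.apache.flink.types.NullValue;

    public class FromDataSetOrderSketch {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // Placeholder datasets standing in for the ones read from the two input files.
            DataSet<Vertex<Long, Long>> vertices = env.fromElements(
                    new Vertex<>(0L, Long.MAX_VALUE),
                    new Vertex<>(1L, Long.MAX_VALUE));
            DataSet<Edge<Long, NullValue>> edges = env.fromElements(
                    new Edge<>(0L, 1L, NullValue.getInstance()));

            // Vertices come first, edges second. With typed DataSets a swap fails
            // to compile, but if the *file paths* feeding them are swapped, the
            // graph is silently built from the wrong data.
            Graph<Long, Long, NullValue> graph = Graph.fromDataSet(vertices, edges, env);

            graph.getVertices().print();
        }
    }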
On 18 March 2015 at 23:04, Mihail Vieru <vi...@informatik.hu-berlin.de> wrote:

> I'm also using 0 as sourceID. The exact program arguments:
>
> 0 /home/vieru/dev/flink-experiments/data/social_network.edgelist /home/vieru/dev/flink-experiments/data/social_network.verticeslist /home/vieru/dev/flink-experiments/sssp-output-higgstwitter 10
>
> And yes, I call both methods on the initialized Graph *mappedInput*. I don't understand why the distances are computed correctly for the small graph (also read from files) but not for the larger one. The messages appear to be wrong in the latter case.
>
> On 18.03.2015 21:55, Vasiliki Kalavri wrote:
>
> hmm, I'm starting to run out of ideas...
> What's your source ID parameter? I ran mine with 0.
> About the result, you call both createVertexCentricIteration() and runVertexCentricIteration() on the initialized graph, right?
>
> On 18 March 2015 at 22:33, Mihail Vieru <vi...@informatik.hu-berlin.de> wrote:
>
>> Hi Vasia,
>>
>> yes, I am using the latest master. I just did a pull again and the problem persists. Perhaps Robert could confirm as well.
>>
>> I've set the solution set to unmanaged in SSSPUnweighted, as Stephan proposed, and the job finishes. So I am able to proceed using this workaround.
>>
>> An odd thing occurs now though. The distances aren't computed correctly for the SNAP graph and remain the ones set in InitVerticesMapper(). For the small graph in SSSPDataUnweighted they are OK. I'm currently investigating this behavior.
>>
>> Cheers,
>> Mihail
>>
>> On 18.03.2015 20:55, Vasiliki Kalavri wrote:
>>
>> Hi Mihail,
>>
>> I used your code to generate the vertex file, then gave this and the edge list as input to your SSSP implementation and still couldn't reproduce the exception. I'm using the same local setup as I describe above.
>> I'm not aware of any recent changes that might be relevant, but, just in case, are you using the latest master?
>>
>> Cheers,
>> V.
>>
>> On 18 March 2015 at 19:21, Mihail Vieru <vi...@informatik.hu-berlin.de> wrote:
>>
>>> Hi Vasia,
>>>
>>> I have used a simple job (attached) to generate a file which looks like this:
>>>
>>> 0 0
>>> 1 1
>>> 2 2
>>> ...
>>> 456629 456629
>>> 456630 456630
>>>
>>> I need the vertices to be generated from a file for my future work.
>>>
>>> Cheers,
>>> Mihail
>>>
>>> On 18.03.2015 17:04, Vasiliki Kalavri wrote:
>>>
>>> Hi Mihail, Robert,
>>>
>>> I've tried reproducing this, but I couldn't. I'm using the same twitter input graph from SNAP that you link to and also Scala IDE. The job finishes without a problem (both the SSSP example from Gelly and the unweighted version).
>>>
>>> The only thing I changed to run your version was creating the graph from the edge set only, i.e. like this:
>>>
>>> Graph<Long, Long, NullValue> graph = Graph.fromDataSet(edges,
>>>     new MapFunction<Long, Long>() {
>>>         public Long map(Long value) {
>>>             return Long.MAX_VALUE;
>>>         }
>>>     }, env);
>>>
>>> Since the twitter input is an edge list, how do you generate the vertex dataset in your case?
>>>
>>> Thanks,
>>> -Vasia.
>>>
>>> On 18 March 2015 at 16:54, Mihail Vieru <vi...@informatik.hu-berlin.de> wrote:
>>>
>>>> Hi,
>>>>
>>>> great! Thanks!
>>>>
>>>> I really need this bug fixed because I'm laying the groundwork for my Diplom thesis and I need to be sure that the Gelly API is reliable and can handle large datasets as intended.
>>>>
>>>> Cheers,
>>>> Mihail
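The "simple job" attachment mentioned above is elided from the thread. As a rough reconstruction, a generator along the following lines would produce the id/value file shown; this is a sketch only, and the separator, output path, and job name are assumptions:

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.ExecutionEnvironment;

    public class GenerateVertexList {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // Ids 0..456630, matching the excerpt "0 0" through "456630 456630".
            env.generateSequence(0L, 456630L)
                    .map(new MapFunction<Long, String>() {
                        @Override
                        public String map(Long id) {
                            // "<id> <initial value>"; the thread's file uses the
                            // id itself as the value. Separator is assumed.
                            return id + " " + id;
                        }
                    })
                    .writeAsText("/tmp/social_network.verticeslist"); // path assumed

            env.execute("generate vertex list");
        }
    }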
>>>> On 18.03.2015 15:40, Robert Waury wrote:
>>>>
>>>> Hi,
>>>>
>>>> I managed to reproduce the behavior and as far as I can tell it seems to be a problem with the memory allocation.
>>>>
>>>> I have filed a bug report in JIRA to get the attention of somebody who knows the runtime better than I do.
>>>>
>>>> https://issues.apache.org/jira/browse/FLINK-1734
>>>>
>>>> Cheers,
>>>> Robert
>>>>
>>>> On Tue, Mar 17, 2015 at 3:52 PM, Mihail Vieru <vi...@informatik.hu-berlin.de> wrote:
>>>>
>>>>> Hi Robert,
>>>>>
>>>>> thank you for your reply.
>>>>>
>>>>> I'm starting the job from the Scala IDE, so there is only one JobManager and one TaskManager, in the same JVM. I've doubled the memory in the eclipse.ini settings but I still get the exception.
>>>>>
>>>>> -vmargs
>>>>> -Xmx2048m
>>>>> -Xms100m
>>>>> -XX:MaxPermSize=512m
>>>>>
>>>>> Best,
>>>>> Mihail
>>>>>
>>>>> On 17.03.2015 10:11, Robert Waury wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> can you tell me how much memory your job has and how many workers you are running?
>>>>>
>>>>> From the trace it seems the internal hash table allocated only 7 MB for the graph data and therefore runs out of memory pretty quickly.
>>>>>
>>>>> Skewed data could also be an issue, but with a minimum of 5 pages and a maximum of 8 it seems to be distributed fairly evenly across the different partitions.
>>>>>
>>>>> Cheers,
>>>>> Robert
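Robert's question about job memory was answered above with an IDE heap bump via eclipse.ini. An alternative for runs from the IDE is to hand the local environment an explicit managed-memory budget. A sketch, assuming a Flink version whose local environment accepts a custom Configuration and which reads the taskmanager.memory.size key; both are assumptions, not confirmed for the master branch discussed here:

    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.configuration.Configuration;

    public class LocalManagedMemorySketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Request 512 MB of managed memory instead of the default fraction
            // of the (small) IDE heap; key name assumed from the era's config.
            conf.setInteger("taskmanager.memory.size", 512);

            ExecutionEnvironment env = ExecutionEnvironment.createLocalEnvironment(conf);

            // Trivial placeholder job; the SSSP program would be built on `env`.
            env.fromElements(1L, 2L, 3L).print();
        }
    }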
>>>>> On Tue, Mar 17, 2015 at 1:25 AM, Mihail Vieru <vi...@informatik.hu-berlin.de> wrote:
>>>>>
>>>>>> And the correct SSSPUnweighted attached.
>>>>>>
>>>>>> On 17.03.2015 01:23, Mihail Vieru wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm getting the following RuntimeException for an adaptation of the SingleSourceShortestPaths example using the Gelly API (see attachment). It's been adapted for unweighted graphs having vertices with Long values.
>>>>>>>
>>>>>>> As an input graph I'm using the social network graph (~200MB unpacked) from here: https://snap.stanford.edu/data/higgs-twitter.html
>>>>>>>
>>>>>>> For the small SSSPDataUnweighted graph (also attached) it terminates and computes the distances correctly.
>>>>>>>
>>>>>>> 03/16/2015 17:18:23    IterationHead(WorksetIteration (Vertex-centric iteration (org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$VertexDistanceUpdater@dca6fe4 | org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$MinDistanceMessenger@6577e8ce)))(2/4) switched to FAILED
>>>>>>> java.lang.RuntimeException: Memory ran out. Compaction failed. numPartitions: 32 minPartition: 5 maxPartition: 8 number of overflow segments: 176 bucketSize: 217 Overall memory: 20316160 Partition memory: 7208960 Message: Index: 8, Size: 7
>>>>>>>     at org.apache.flink.runtime.operators.hash.CompactingHashTable.insert(CompactingHashTable.java:390)
>>>>>>>     at org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTable(CompactingHashTable.java:337)
>>>>>>>     at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:216)
>>>>>>>     at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:278)
>>>>>>>     at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
>>>>>>>     at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:205)
>>>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>>>>
>>>>>>> Best,
>>>>>>> Mihail
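The trace shows the solution set's CompactingHashTable failing while being built, which is what the unmanaged-solution-set workaround mentioned further up in the thread avoids. A minimal sketch of that switch on a plain DataSet delta iteration follows; the step function is a placeholder, and whether Gelly's vertex-centric iteration exposed the same switch on that master branch is not confirmed here:

    import org.apache.flink.api.common.functions.JoinFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.operators.DeltaIteration;
    import org.apache.flink.api.java.tuple.Tuple2;

    public class UnmanagedSolutionSetSketch {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // Placeholder vertex states: (vertexId, distance).
            DataSet<Tuple2<Long, Long>> initial = env.fromElements(
                    new Tuple2<>(0L, 0L),
                    new Tuple2<>(1L, Long.MAX_VALUE));

            // Solution set keyed on field 0 (the vertex id), at most 10 supersteps.
            DeltaIteration<Tuple2<Long, Long>, Tuple2<Long, Long>> iteration =
                    initial.iterateDelta(initial, 10, 0);

            // The workaround: keep the solution set in a heap hash map instead of
            // the managed-memory CompactingHashTable that failed in the trace above.
            iteration.setSolutionSetUnManaged(true);

            // Placeholder step function: join workset and solution set, keep the
            // smaller distance. A real SSSP step would also propagate to neighbors.
            DataSet<Tuple2<Long, Long>> delta = iteration.getWorkset()
                    .join(iteration.getSolutionSet())
                    .where(0).equalTo(0)
                    .with(new JoinFunction<Tuple2<Long, Long>, Tuple2<Long, Long>, Tuple2<Long, Long>>() {
                        @Override
                        public Tuple2<Long, Long> join(Tuple2<Long, Long> candidate, Tuple2<Long, Long> current) {
                            return new Tuple2<>(candidate.f0, Math.min(candidate.f1, current.f1));
                        }
                    });

            iteration.closeWith(delta, delta).print();
        }
    }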