Hi Vasia,

yes, I am using the latest master. I just did a pull again and the problem persists. Perhaps Robert could confirm as well.

I've set the solution set to unmanaged in SSSPUnweighted, as Stephan proposed, and the job now finishes. So I am able to proceed using this workaround.
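
For reference, the workaround boils down to something like the following sketch at the DataSet API level (hypothetical class and dataset names, not the actual SSSPUnweighted code): calling setSolutionSetUnManaged(true) on the delta iteration keeps the solution set in an ordinary object map instead of the managed-memory CompactingHashTable that fails in the trace quoted further down.

    import org.apache.flink.api.common.functions.FilterFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.operators.DeltaIteration;
    import org.apache.flink.api.java.tuple.Tuple2;

    public class UnmanagedSolutionSetSketch {

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // (vertexId, distance) pairs standing in for the real initial solution set
            DataSet<Tuple2<Long, Long>> initial = env.fromElements(
                    new Tuple2<Long, Long>(0L, 0L),
                    new Tuple2<Long, Long>(1L, Long.MAX_VALUE));

            // delta iteration keyed on the vertex id (field 0), at most 10 supersteps
            DeltaIteration<Tuple2<Long, Long>, Tuple2<Long, Long>> iteration =
                    initial.iterateDelta(initial, 10, 0);

            // the workaround: keep the solution set out of Flink's managed memory
            iteration.setSolutionSetUnManaged(true);

            // trivial step function just to make the sketch complete: an empty
            // workset terminates the iteration after the first superstep
            DataSet<Tuple2<Long, Long>> emptyWorkset = iteration.getWorkset()
                    .filter(new FilterFunction<Tuple2<Long, Long>>() {
                        @Override
                        public boolean filter(Tuple2<Long, Long> value) {
                            return false;
                        }
                    });

            iteration.closeWith(iteration.getWorkset(), emptyWorkset).print();
        }
    }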

An odd thing occurs now, though. The distances aren't computed correctly for the SNAP graph and remain the ones set in InitVerticesMapper(). For the small graph in SSSPDataUnweighted they are OK. I'm currently investigating this behavior.

Cheers,
Mihail

On 18.03.2015 20:55, Vasiliki Kalavri wrote:
Hi Mihail,

I used your code to generate the vertex file, then gave this and the edge list as input to your SSSP implementation and still couldn't reproduce the exception. I'm using the same local setup as I describe above. I'm not aware of any recent changes that might be relevant, but, just in case, are you using the latest master?

Cheers,
V.

On 18 March 2015 at 19:21, Mihail Vieru <vi...@informatik.hu-berlin.de> wrote:

    Hi Vasia,

    I have used a simple job (attached) to generate a file which looks
    like this:

    0 0
    1 1
    2 2
    ...
    456629 456629
    456630 456630

    I need the vertices to be generated from a file for my future work.
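
    Roughly, such a job could look like the following sketch (a hypothetical reconstruction, since the actual attachment isn't shown here; the output path is a placeholder):

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple2;

    public class GenerateVertexFile {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // higgs-twitter vertex ids run from 0 to 456630
            env.generateSequence(0L, 456630L)
               .map(new MapFunction<Long, Tuple2<Long, Long>>() {
                   @Override
                   public Tuple2<Long, Long> map(Long id) {
                       return new Tuple2<Long, Long>(id, id);
                   }
               })
               // one "id id" line per vertex, space-separated
               .writeAsCsv("/tmp/vertices.txt", "\n", " ");

            env.execute("generate vertex file");
        }
    }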

    Cheers,
    Mihail



    On 18.03.2015 17:04, Vasiliki Kalavri wrote:
    Hi Mihail, Robert,

    I've tried reproducing this, but I couldn't.
    I'm using the same twitter input graph from SNAP that you link to
    and also Scala IDE.
    The job finishes without a problem (both the SSSP example from
    Gelly and the unweighted version).

    The only thing I changed to run your version was creating the
    graph from the edge set only, i.e. like this:

    Graph<Long, Long, NullValue> graph = Graph.fromDataSet(edges,
        new MapFunction<Long, Long>() {
            @Override
            public Long map(Long value) {
                return Long.MAX_VALUE;
            }
        }, env);
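
    (The MapFunction here simply initializes every vertex value to Long.MAX_VALUE, so the vertex dataset is derived from the edge list alone and no separate vertex file is needed.)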
    Since the twitter input is an edge list, how do you generate the
    vertex dataset in your case?

    Thanks,
    -Vasia.

    On 18 March 2015 at 16:54, Mihail Vieru
    <vi...@informatik.hu-berlin.de> wrote:

        Hi,

        great! Thanks!

        I really need this bug fixed because I'm laying the
        groundwork for my Diplom thesis and I need to be sure that
        the Gelly API is reliable and can handle large datasets as
        intended.

        Cheers,
        Mihail


        On 18.03.2015 15:40, Robert Waury wrote:
        Hi,

        I managed to reproduce the behavior and as far as I can tell
        it seems to be a problem with the memory allocation.

        I have filed a bug report in JIRA to get the attention of
        somebody who knows the runtime better than I do.

        https://issues.apache.org/jira/browse/FLINK-1734

        Cheers,
        Robert

        On Tue, Mar 17, 2015 at 3:52 PM, Mihail Vieru
        <vi...@informatik.hu-berlin.de> wrote:

            Hi Robert,

            thank you for your reply.

            I'm starting the job from the Scala IDE, so there is only
            one JobManager and one TaskManager in the same JVM.
            I've doubled the memory in the eclipse.ini settings, but I
            still get the exception.

            -vmargs
            -Xmx2048m
            -Xms100m
            -XX:MaxPermSize=512m
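
            (As far as I understand, in local mode the TaskManager sizes
            its managed memory as a fraction of the free JVM heap when
            nothing else is configured, so a larger -Xmx should also
            enlarge the budget of the hash table that fails below.)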

            Best,
            Mihail


            On 17.03.2015 10:11, Robert Waury wrote:
            Hi,

            can you tell me how much memory your job has and how
            many workers you are running?

            From the trace it seems the internal hash table allocated
            only 7 MB for the graph data and therefore runs out of
            memory pretty quickly.

            Skewed data could also be an issue, but with a minimum of
            5 pages and a maximum of 8 it seems to be distributed
            fairly evenly across the different partitions.

            Cheers,
            Robert

            On Tue, Mar 17, 2015 at 1:25 AM, Mihail Vieru
            <vi...@informatik.hu-berlin.de> wrote:

                And the correct SSSPUnweighted attached.


                On 17.03.2015 01:23, Mihail Vieru wrote:

                    Hi,

                    I'm getting the following RuntimeException for
                    an adaptation of the SingleSourceShortestPaths
                    example using the Gelly API (see attachment).
                    It's been adapted for unweighted graphs having
                    vertices with Long values.

                    As an input graph I'm using the social network
                    graph (~200MB unpacked) from here:
                    https://snap.stanford.edu/data/higgs-twitter.html

                    For the small SSSPDataUnweighted graph (also
                    attached) it terminates and computes the
                    distances correctly.


                    03/16/2015 17:18:23  IterationHead(WorksetIteration (Vertex-centric iteration (org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$VertexDistanceUpdater@dca6fe4 | org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$MinDistanceMessenger@6577e8ce)))(2/4) switched to FAILED
                    java.lang.RuntimeException: Memory ran out. Compaction failed. numPartitions: 32 minPartition: 5 maxPartition: 8 number of overflow segments: 176 bucketSize: 217 Overall memory: 20316160 Partition memory: 7208960 Message: Index: 8, Size: 7
                        at org.apache.flink.runtime.operators.hash.CompactingHashTable.insert(CompactingHashTable.java:390)
                        at org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTable(CompactingHashTable.java:337)
                        at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:216)
                        at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:278)
                        at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
                        at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:205)
                        at java.lang.Thread.run(Thread.java:745)


                    Best,
                    Mihail
