I'm also using 0 as the source ID. The exact program arguments are:

0 /home/vieru/dev/flink-experiments/data/social_network.edgelist /home/vieru/dev/flink-experiments/data/social_network.verticeslist /home/vieru/dev/flink-experiments/sssp-output-higgstwitter 10

And yes, I call both methods on the initialized Graph *mappedInput*. I don't understand why the distances are computed correctly for the small graph (also read from files) but not for the larger one.
The messages appear to be wrong in the latter case.

On 18.03.2015 21:55, Vasiliki Kalavri wrote:
hmm, I'm starting to run out of ideas...
What's your source ID parameter? I ran mine with 0.
About the result, you call both createVertexCentricIteration() and runVertexCentricIteration() on the initialized graph, right?

On 18 March 2015 at 22:33, Mihail Vieru <vi...@informatik.hu-berlin.de> wrote:

    Hi Vasia,

    yes, I am using the latest master. I just did a pull again and the
    problem persists. Perhaps Robert could confirm as well.

    I've set the solution set to unmanaged in SSSPUnweighted as
    Stephan proposed and the job finishes. So I am able to proceed
    using this workaround.

    An odd thing occurs now, though. The distances aren't computed
    correctly for the SNAP graph and remain the ones set in
    InitVerticesMapper(). For the small graph in SSSPDataUnweighted
    they are OK. I'm currently investigating this behavior.

    Cheers,
    Mihail


    On 18.03.2015 20:55, Vasiliki Kalavri wrote:
    Hi Mihail,

    I used your code to generate the vertex file, then gave this and
    the edge list as input to your SSSP implementation and still
    couldn't reproduce the exception. I'm using the same local setup
    as I describe above.
    I'm not aware of any recent changes that might be relevant, but,
    just in case, are you using the latest master?

    Cheers,
    V.

    On 18 March 2015 at 19:21, Mihail Vieru
    <vi...@informatik.hu-berlin.de> wrote:

        Hi Vasia,

        I have used a simple job (attached) to generate a file which
        looks like this:

        0 0
        1 1
        2 2
        ...
        456629 456629
        456630 456630

        I need the vertices to be generated from a file for my future
        work.
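For reference, a generator producing this format doesn't need Flink at all; here is a stand-alone sketch in plain Java (the class name, helper name, and output filename are illustrative and not taken from the attached job, which may be structured differently):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class VerticesFileGenerator {

    // Builds one "id id" line per vertex id from 0 up to maxVertexId,
    // matching the file excerpt above.
    static List<String> verticesLines(long maxVertexId) {
        List<String> lines = new ArrayList<>();
        for (long i = 0; i <= maxVertexId; i++) {
            lines.add(i + " " + i);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // 456630 is the largest vertex id shown in the excerpt above.
        Files.write(Paths.get("social_network.verticeslist"),
                verticesLines(456630));
    }
}
```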

        Cheers,
        Mihail



        On 18.03.2015 17:04, Vasiliki Kalavri wrote:
        Hi Mihail, Robert,

        I've tried reproducing this, but I couldn't.
        I'm using the same twitter input graph from SNAP that you
        link to and also Scala IDE.
        The job finishes without a problem (both the SSSP example
        from Gelly and the unweighted version).

        The only thing I changed to run your version was creating
        the graph from the edge set only, i.e. like this:

        Graph<Long, Long, NullValue> graph = Graph.fromDataSet(edges,
                new MapFunction<Long, Long>() {
                    public Long map(Long value) {
                        return Long.MAX_VALUE;
                    }
                }, env);

        Since the twitter input is an edge list, how do you generate
        the vertex dataset in your case?

        Thanks,
        -Vasia.

        On 18 March 2015 at 16:54, Mihail Vieru
        <vi...@informatik.hu-berlin.de> wrote:

            Hi,

            great! Thanks!

            I really need this bug fixed because I'm laying the
            groundwork for my Diplom thesis and I need to be sure
            that the Gelly API is reliable and can handle large
            datasets as intended.

            Cheers,
            Mihail


            On 18.03.2015 15:40, Robert Waury wrote:
            Hi,

            I managed to reproduce the behavior and as far as I can
            tell it seems to be a problem with the memory allocation.

            I have filed a bug report in JIRA to get the attention
            of somebody who knows the runtime better than I do.

            https://issues.apache.org/jira/browse/FLINK-1734

            Cheers,
            Robert

            On Tue, Mar 17, 2015 at 3:52 PM, Mihail Vieru
            <vi...@informatik.hu-berlin.de> wrote:

                Hi Robert,

                thank you for your reply.

                I'm starting the job from the Scala IDE. So only
                one JobManager and one TaskManager in the same JVM.
                I've doubled the memory in the eclipse.ini settings
                but I still get the Exception.

                -vmargs
                -Xmx2048m
                -Xms100m
                -XX:MaxPermSize=512m

                Best,
                Mihail


                On 17.03.2015 10:11, Robert Waury wrote:
                Hi,

                can you tell me how much memory your job has and
                how many workers you are running?

                From the trace it seems the internal hash table
                allocated only 7 MB for the graph data and
                therefore runs out of memory pretty quickly.

                Skewed data could also be an issue, but with a
                minimum of 5 pages and a maximum of 8 it seems to
                be distributed fairly evenly across the different
                partitions.
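For what it's worth, the numbers in the trace are consistent with that reading. A quick back-of-the-envelope check (the 32 KB page size is an assumption based on Flink's default memory segment size, not something stated in the trace):

```python
# Values copied from the RuntimeException message quoted below.
overall_memory = 20316160     # bytes
partition_memory = 7208960    # bytes
num_partitions = 32
page_size = 32 * 1024         # assumed default Flink memory segment size

# Partition memory is just under 7 MB -- the "only 7 MB" mentioned above.
partition_mb = partition_memory / (1024 * 1024)   # 6.875

# 7208960 bytes / 32 KB = 220 pages; spread over 32 partitions that is
# ~6.9 pages each, which fits the reported minimum of 5 and maximum of 8.
pages = partition_memory // page_size             # 220
avg_pages = pages / num_partitions                # 6.875

print(partition_mb, pages, avg_pages)
```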

                Cheers,
                Robert

                On Tue, Mar 17, 2015 at 1:25 AM, Mihail Vieru
                <vi...@informatik.hu-berlin.de> wrote:

                    And the correct SSSPUnweighted attached.


                    On 17.03.2015 01:23, Mihail Vieru wrote:

                        Hi,

                        I'm getting the following RuntimeException
                        for an adaptation of the
                        SingleSourceShortestPaths example using
                        the Gelly API (see attachment). It's been
                        adapted for unweighted graphs having
                        vertices with Long values.

                        As an input graph I'm using the social
                        network graph (~200MB unpacked) from here:
                        https://snap.stanford.edu/data/higgs-twitter.html

                        For the small SSSPDataUnweighted graph
                        (also attached) it terminates and computes
                        the distances correctly.


                        03/16/2015 17:18:23  IterationHead(WorksetIteration (Vertex-centric iteration (org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$VertexDistanceUpdater@dca6fe4 | org.apache.flink.graph.library.SingleSourceShortestPathsUnweighted$MinDistanceMessenger@6577e8ce)))(2/4) switched to FAILED
                        java.lang.RuntimeException: Memory ran out. Compaction failed. numPartitions: 32 minPartition: 5 maxPartition: 8 number of overflow segments: 176 bucketSize: 217 Overall memory: 20316160 Partition memory: 7208960 Message: Index: 8, Size: 7
                            at org.apache.flink.runtime.operators.hash.CompactingHashTable.insert(CompactingHashTable.java:390)
                            at org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTable(CompactingHashTable.java:337)
                            at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.readInitialSolutionSet(IterationHeadPactTask.java:216)
                            at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:278)
                            at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
                            at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:205)
                            at java.lang.Thread.run(Thread.java:745)


                        Best,
                        Mihail