Hi Rob,
On 13 May 2016 at 11:22, Arkay <robkee...@gmail.com> wrote:

> Hi to all,
>
> I’m aware there are a few threads on this, but I haven’t been able to solve
> an issue I am seeing and hoped someone can help. I’m trying to run the
> following:
>
> val connectedNetwork = new org.apache.flink.api.scala.DataSet[Vertex[Long, Long]](
>   Graph.fromTuple2DataSet(inputEdges, vertexInitialiser, env)
>     .run(new ConnectedComponents[Long, NullValue](100)))
>
> And hitting the error:
>
> java.lang.RuntimeException: Memory ran out. numPartitions: 32 minPartition: 8
> maxPartition: 8 number of overflow segments: 122 bucketSize: 206
> Overall memory: 19365888 Partition memory: 8388608
>   at org.apache.flink.runtime.operators.hash.CompactingHashTable.getNextBuffer(CompactingHashTable.java:753)
>   at org.apache.flink.runtime.operators.hash.CompactingHashTable.insertBucketEntryFromStart(CompactingHashTable.java:546)
>   at org.apache.flink.runtime.operators.hash.CompactingHashTable.insertOrReplaceRecord(CompactingHashTable.java:423)
>   at org.apache.flink.runtime.operators.hash.CompactingHashTable.buildTableWithUniqueKey(CompactingHashTable.java:325)
>   at org.apache.flink.runtime.iterative.task.IterationHeadTask.readInitialSolutionSet(IterationHeadTask.java:212)
>   at org.apache.flink.runtime.iterative.task.IterationHeadTask.run(IterationHeadTask.java:273)
>   at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
>   at java.lang.Thread.run(Unknown Source)
>
> I’m running Flink 1.0.3 on windows 10 using start-local.bat. I have Xmx set
> to 6500MB, 8 workers, parallelism 8 and other memory settings left at
> default.

The start-local script will start a single JobManager and TaskManager.
What do you mean by 8 workers? Have you set the numberOfTaskSlots to 8?
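For reference, the slot count and the TaskManager heap are both set in conf/flink-conf.yaml. A minimal sketch follows; the values are only assumptions taken from your description (8 slots, parallelism 8, about 6500 MB of heap), not recommendations. How start-local.bat applies them on Windows is not confirmed here, so the dashboard is the place to see what the TaskManager actually received.

```yaml
# conf/flink-conf.yaml (illustrative values, adjust to your machine)

# Number of parallel task slots offered by the single local TaskManager
taskmanager.numberOfTaskSlots: 8

# JVM heap for the TaskManager, in MB
taskmanager.heap.mb: 6500

# Default parallelism for jobs that do not set one explicitly
parallelism.default: 8
```

Changes to this file only take effect after restarting the local Flink instance.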
To give all available memory to your TaskManager, you should set the
"taskmanager.heap.mb" configuration option in flink-conf.yaml. Can you open
the Flink dashboard at http://localhost:8081/ and check the configuration of
your taskmanager?

Cheers,
-Vasia.

> The inputEdges dataset contains 141MB of Long,Long pairs (which is around 6
> million edges). ParentID is unique and always negative, ChildID is
> non-unique and always positive (simulating a bipartite graph)
>
> An example few rows:
> -91498683401,1738
> -135344401,5370
> -100260517801,7970
> -154352186001,12311
> -160265532002,12826
>
> The vast majority of the childIds are actually unique, and the most popular
> ID only occurs 10 times.
>
> VertexInitialiser just sets the vertex value to the id.
>
> Hopefully this is just a memory setting I’m not seeing for the hashTable as
> it dies almost instantly, I don’t think it gets very far into the dataset.
> I understand that the CompactingHashTable cannot spill, but I’d be surprised
> if it needed to at these low volumes.
>
> Many thanks for any help!
>
> Rob
>
> --
> View this message in context:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Memory-ran-out-error-when-running-connected-components-tp6888.html
> Sent from the Apache Flink User Mailing List archive. mailing list archive
> at Nabble.com.