It may be slow because of serialization (have you tried Kryo there?) or just because at some point the data starts spilling to disk. Try profiling the tasks while it's running (e.g. just use jstack to see what they're doing), and definitely try Kryo if you're currently using Java serialization. Kryo will reduce both the size on disk and the serialization time.
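For reference, switching is just a couple of SparkConf settings plus registering your record classes; here is a rough sketch (the VertexState and MyRegistrator names are placeholders for whatever your job actually uses):

import com.esotericsoftware.kryo.Kryo
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.serializer.KryoRegistrator

// Placeholder for whatever per-vertex record your map function produces.
class VertexState(val id: Long, val value: Double) extends Serializable

// Registering the classes that dominate your cached RDDs lets Kryo avoid
// writing full class names with every object.
class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[VertexState])
  }
}

val conf = new SparkConf()
  .setAppName("iterative-graph")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrator", "MyRegistrator") // fully-qualified class name in a real job
val sc = new SparkContext(conf)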
Matei

On May 5, 2014, at 2:54 PM, Andrea Esposito <and1...@gmail.com> wrote:

> Update: checkpointing doesn't take effect. I checked with the "isCheckpointed" method, but it always returns false. ???
>
>
> 2014-05-05 23:14 GMT+02:00 Andrea Esposito <and1...@gmail.com>:
>
> Checkpointing doesn't seem to help. I do it at each iteration/superstep.
>
> Looking more deeply, the RDDs are recomputed just a few times during the initial 'phase'; after that they aren't recomputed anymore. I attach screenshots: bootstrap phase, recompute section and after. This is still unexpected because I persist all the intermediate results.
>
> In any case, the time of each iteration degrades steadily; for instance, the first superstep takes 3 sec and the 70th superstep takes 8 sec.
>
> An iteration, looking at the screenshot, is from row 528 to 122.
>
> Any idea where to investigate?
>
>
> 2014-05-02 22:28 GMT+02:00 Andrew Ash <and...@andrewash.com>:
>
> If you end up with a really long dependency tree between RDDs (like 100+), people have reported success with the .checkpoint() method. This computes the RDD and then saves it, flattening the dependency tree. It turns out that a really long RDD dependency graph causes the serialized size of tasks to go up, plus any failure causes a long sequence of operations to regenerate the missing partition.
>
> Maybe give that a shot and see if it helps?
>
>
> On Fri, May 2, 2014 at 3:29 AM, Andrea Esposito <and1...@gmail.com> wrote:
>
> Sorry for the very late answer.
>
> I carefully followed what you pointed out and figured out that the structure used for each record was too big, with many small objects. After changing it, the memory usage decreased drastically.
>
> Despite that, I'm still struggling with performance degrading across supersteps. Now the memory footprint is much smaller than before and GC time isn't noticeable anymore.
> I suspected that some RDDs were being recomputed, and watching the stages carefully there is evidence of that, but I don't understand why it's happening.
>
> Recalling my usage pattern:
>
> newRdd = oldRdd.map(myFun).persist(myStorageLevel)
> newRdd.foreach(x => {}) // Force evaluation
> oldRdd.unpersist(true)
>
> Given this usage pattern, I also tried not unpersisting the intermediate RDDs (i.e. oldRdd), but nothing changed.
>
> Any hints? How could I debug this?
>
>
> 2014-04-14 12:55 GMT+02:00 Andrew Ash <and...@andrewash.com>:
>
> A lot of your time is being spent in garbage collection (second image). Maybe your dataset doesn't easily fit into memory? Can you reduce the number of new objects created in myFun?
>
> How big are your heap sizes?
>
> Another observation is that in the 4th image some of your RDDs are massive and some are tiny.
>
>
> On Mon, Apr 14, 2014 at 11:45 AM, Andrea Esposito <and1...@gmail.com> wrote:
>
> Hi all,
>
> I'm developing an iterative computation over graphs, but I'm struggling with embarrassingly low performance.
>
> The computation is heavily iterative and I'm following this RDD usage pattern:
>
> newRdd = oldRdd.map(myFun).persist(myStorageLevel)
> newRdd.foreach(x => {}) // Force evaluation
> oldRdd.unpersist(true)
>
> I'm using a machine equipped with 30 cores and 120 GB of RAM.
> As an example, I ran it on a small graph of 4000 vertices and 80 thousand edges; the first iterations already take 10+ minutes, and later ones take a lot longer.
> I attach the Spark UI screenshots of just the first 2 iterations.
>
> I tried with MEMORY_ONLY_SER and MEMORY_AND_DISK_SER, and I also changed "spark.shuffle.memoryFraction" to 0.3, but nothing changed (with so much RAM for 4E10 vertices these settings are quite pointless, I guess).
>
> How should I continue to investigate?
>
> Any advice is very welcome, thanks.
>
> Best,
> EA
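Putting the pieces of this thread together, here is a rough sketch of the superstep loop with checkpointing moved before the forced evaluation (one common reason isCheckpointed stays false is that checkpoint() only takes effect if it's called before the first job runs on that RDD). The checkpoint directory, the every-10-supersteps interval and the toy myFun are placeholders, and an existing SparkContext sc is assumed:

import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

// Illustrative values only; adjust the directory and interval for your setup.
sc.setCheckpointDir("hdfs:///tmp/spark-checkpoints")

val myStorageLevel = StorageLevel.MEMORY_ONLY_SER
def myFun(v: (Long, Double)): (Long, Double) = (v._1, v._2 + 1.0) // placeholder

var oldRdd: RDD[(Long, Double)] =
  sc.parallelize(1L to 4000L).map(id => (id, 0.0)).persist(myStorageLevel)

for (superstep <- 1 to 70) {
  val newRdd = oldRdd.map(myFun).persist(myStorageLevel)
  if (superstep % 10 == 0) newRdd.checkpoint() // must come before the first action on newRdd
  newRdd.foreach(x => {})                      // force evaluation (and the checkpoint, if requested)
  oldRdd.unpersist(blocking = true)
  oldRdd = newRdd
}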