Re: RDD recomputation

2016-03-10 Thread Kevin Mellott
I've had very good success troubleshooting this type of thing with the Spark Web UI, which shows a breakdown of all tasks, including the RDDs being used and any cached data. Additional information about this tool can be found at http://spark.apache.org/docs/latest/monitor
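
If event logging is turned on (spark.eventLog.enabled), the stage information the Web UI displays is also written to the event log as one JSON object per line, so a rough check can be scripted. The following is only a minimal sketch: the function name is made up for illustration, and the field names ("Stage Info", "RDD Info", "RDD ID", "Storage Level") are assumptions about the event log layout that should be verified against a log from your own Spark version. An RDD that shows up in several completed stages without a storage level is only a hint that it may be recomputed, not proof.

```python
import json
from collections import defaultdict


def rdds_seen_in_multiple_stages(lines):
    """Return {rdd_id: info} for RDDs listed in more than one completed stage.

    `lines` is any iterable of event-log lines (one JSON object per line).
    The field names below are assumed, not guaranteed, for every Spark version.
    """
    stages_by_rdd = defaultdict(set)
    names = {}
    persist_requested = {}

    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except ValueError:
            continue  # skip lines that are not valid JSON
        if not isinstance(event, dict):
            continue
        if event.get("Event") != "SparkListenerStageCompleted":
            continue

        stage_info = event.get("Stage Info", {})
        stage_id = stage_info.get("Stage ID")
        for rdd in stage_info.get("RDD Info", []):
            rdd_id = rdd.get("RDD ID")
            level = rdd.get("Storage Level", {})
            stages_by_rdd[rdd_id].add(stage_id)
            names[rdd_id] = rdd.get("Name")
            # True if the RDD was marked for memory or disk persistence.
            persist_requested[rdd_id] = bool(
                level.get("Use Memory") or level.get("Use Disk")
            )

    # Keep only RDDs that appear in two or more completed stages.
    return {
        rdd_id: {
            "name": names[rdd_id],
            "stages": sorted(stages),
            "persist_requested": persist_requested[rdd_id],
        }
        for rdd_id, stages in stages_by_rdd.items()
        if len(stages) > 1
    }
```

To use it, open the application's event log file and pass the file object directly, for example `rdds_seen_in_multiple_stages(open(path))`, where `path` is wherever your cluster writes its event logs. Entries with `persist_requested` set to False are the first candidates to inspect in the Web UI.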

RDD recomputation

2016-03-10 Thread souri datta
Hi, I am currently trying to optimize my Spark application, and as part of that I want to find out whether any stage in the code recomputes a large RDD (so that I can optimize it by persisting or checkpointing it). Is there any indication in the event logs that tells us about an RDD bein