I'm working on this patch to visualize stages: https://github.com/apache/spark/pull/2077
Phuoc Do

On Mon, Aug 4, 2014 at 10:12 PM, Zongheng Yang <zonghen...@gmail.com> wrote:
> I agree that this is definitely useful.
>
> One related project I know of is Sparkling [1] (also see the talk at Spark
> Summit 2014 [2]), but it'd be great (and I imagine somewhat challenging)
> to visualize the *physical execution* graph of a Spark job.
>
> [1] http://pr01.uml.edu/
> [2] http://spark-summit.org/2014/talk/sparkling-identification-of-task-skew-and-speculative-partition-of-data-for-spark-applications
>
> On Mon, Aug 4, 2014 at 8:55 PM, rpandya <r...@iecommerce.com> wrote:
> > Is there a way to visualize the task dependency graph of an application,
> > during or after its execution? The list of stages on port 4040 is useful,
> > but still quite limited. For example, I've found that if I don't cache()
> > the result of one expensive computation, it will get repeated 4 times,
> > but it is not easy to trace through exactly why. Ideally, what I would
> > like for each stage is:
> > - the individual tasks and their dependencies
> > - the various RDD operators that have been applied
> > - the full stack trace for the stage barrier, the task, and the lambdas
> >   used (the RDDs are often manipulated inside layers of code, so the
> >   immediate file/line number is not enough)
> >
> > Any suggestions?
> >
> > Thanks,
> >
> > Ravi
> >
> > --
> > View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Visualizing-stage-task-dependency-graph-tp11404.html
> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
--
Phuoc Do
https://vida.io/dnprock
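[Editor's note] The repeated computation Ravi describes follows from Spark's lazy evaluation: an RDD is only a lineage recipe, so each action that depends on an uncached RDD re-runs the whole lineage, while cache() materializes it once for reuse. A toy Python sketch (not Spark itself; the `Lazy` class and call counter are purely illustrative) shows the effect:

```python
# Toy model of lazy lineage: each "action" re-runs the recipe
# unless the result has been cached.

calls = {"expensive": 0}

def expensive(x):
    """Stand-in for a costly per-element computation."""
    calls["expensive"] += 1
    return x * x

class Lazy:
    """Minimal stand-in for an RDD: holds a recipe, not a result."""
    def __init__(self, recipe):
        self.recipe = recipe
        self._cached = None

    def cache(self):
        self._cached = self.recipe()  # materialize once
        return self

    def collect(self):
        # Uncached: recompute the full lineage on every action.
        return self._cached if self._cached is not None else self.recipe()

# Without cache(): four downstream actions -> four recomputations.
rdd = Lazy(lambda: [expensive(i) for i in range(3)])
for _ in range(4):
    rdd.collect()
assert calls["expensive"] == 12  # 3 elements x 4 actions

# With cache(): the recipe runs once, later actions reuse the result.
calls["expensive"] = 0
cached = Lazy(lambda: [expensive(i) for i in range(3)]).cache()
for _ in range(4):
    cached.collect()
assert calls["expensive"] == 3  # 3 elements, computed once
```

In real Spark the same reasoning applies: without `rdd.cache()` (or `persist()`), every action triggered downstream recomputes the expensive stage, which is exactly the 4x repetition visible in the stage list on port 4040.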