I'm working on this patch to visualize stages:

https://github.com/apache/spark/pull/2077
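
As an aside, the repeated work Ravi describes below comes from Spark's lazy
evaluation: each action replays the full lineage of an uncached RDD. A
minimal sketch of the pattern (the input path and parse function here are
made up for illustration):

```scala
// Assumes a SparkContext `sc`; veryExpensiveParse is a hypothetical function.
val parsed = sc.textFile("data.txt")
  .map(line => veryExpensiveParse(line))

// Without cache(), each action below re-runs veryExpensiveParse over the
// whole input, so the parse happens four times:
val n    = parsed.count()
val head = parsed.take(10)
val sum  = parsed.map(_.length).sum()
parsed.saveAsTextFile("out")

// Calling cache() (or persist()) before the first action keeps the parsed
// partitions in memory, so later actions reuse them instead of recomputing.
```

The UI doesn't currently make this recomputation obvious, which is part of
what the stage visualization aims to help with.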

Phuoc Do


On Mon, Aug 4, 2014 at 10:12 PM, Zongheng Yang <zonghen...@gmail.com> wrote:

> I agree that this is definitely useful.
>
> One related project I know of is Sparkling [1] (also see talk at Spark
> Summit 2014 [2]), but it'd be great (and I imagine somewhat
> challenging) to visualize the *physical execution* graph of a Spark
> job.
>
> [1] http://pr01.uml.edu/
> [2] http://spark-summit.org/2014/talk/sparkling-identification-of-task-skew-and-speculative-partition-of-data-for-spark-applications
>
> On Mon, Aug 4, 2014 at 8:55 PM, rpandya <r...@iecommerce.com> wrote:
> > Is there a way to visualize the task dependency graph of an application,
> > during or after its execution? The list of stages on port 4040 is useful,
> > but still quite limited. For example, I've found that if I don't cache()
> > the result of one expensive computation, it will get repeated 4 times,
> > but it is not easy to trace through exactly why. Ideally, what I would
> > like for each stage is:
> > - the individual tasks and their dependencies
> > - the various RDD operators that have been applied
> > - the full stack trace for the stage barrier, the task, and the lambdas
> >   used (often the RDDs are manipulated inside layers of code, so the
> >   immediate file/line# is not enough)
> >
> > Any suggestions?
> >
> > Thanks,
> >
> > Ravi
> >
> >
> >
> > --
> > View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Visualizing-stage-task-dependency-graph-tp11404.html
> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> > For additional commands, e-mail: user-h...@spark.apache.org
> >
>
>


-- 
Phuoc Do
https://vida.io/dnprock
