Re: Bridging gap between Spark UI and Code

2021-05-25 Thread Wenchen Fan
You can see the SQL plan node name in the DAG visualization. Please refer to https://spark.apache.org/docs/latest/web-ui.html for more details. If you still have any confusion, please let us know and we will keep improving the document. On Tue, May 25, 2021 at 4:41 AM mhawes wrote: > @Wenchen Fa

Re: Bridging gap between Spark UI and Code

2021-05-24 Thread mhawes
@Wenchen Fan, understood that the mapping of query plan to application code is very hard. I was wondering if we might be able to instead just handle the mapping from the final physical plan to the stage graph. So for example you’d be able to tell what part of the plan generated which stages. I feel
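
To make the plan-to-stage idea concrete, here is a minimal sketch in plain Python. It is a toy model and not Spark's implementation: `PlanNode` and `assign_stages` are hypothetical names, and the stage numbers are arbitrary labels rather than the stage IDs shown in the Spark UI. It relies only on the general fact that stage boundaries fall at shuffles, which appear as Exchange nodes in a physical plan.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PlanNode:
    """Toy stand-in for a physical plan operator (hypothetical, not a Spark class)."""
    name: str
    children: List["PlanNode"] = field(default_factory=list)


def assign_stages(root: PlanNode) -> Dict[int, List[str]]:
    """Group operators into stages, opening a new stage at every Exchange node."""
    stages: Dict[int, List[str]] = {}
    next_id = 0

    def visit(node: PlanNode, stage_id: int) -> None:
        nonlocal next_id
        if node.name == "Exchange":
            # Assumption of this toy model: the Exchange (shuffle write)
            # is counted as part of the new upstream stage.
            next_id += 1
            stage_id = next_id
        stages.setdefault(stage_id, []).append(node.name)
        for child in node.children:
            visit(child, stage_id)

    visit(root, 0)
    return stages


# A two-phase aggregation: partial aggregate, shuffle, final aggregate.
plan = PlanNode("HashAggregate", [
    PlanNode("Exchange", [
        PlanNode("HashAggregate", [
            PlanNode("Filter", [PlanNode("Scan")]),
        ]),
    ]),
])
stages = assign_stages(plan)
for stage_id, operators in sorted(stages.items()):
    print(stage_id, operators)
```

Each group printed here is the part of the plan that would run as one stage, which is the kind of plan-to-stage annotation being asked about.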

Re: Bridging gap between Spark UI and Code

2021-05-24 Thread Mich Talebzadeh
>> plan’s structure relates to their code. >> *From: *mhawes >> *Date: *Friday, 21 May 2021 at 22:36 >> *To: *dev@spark.apache.org >> *Subject: *Re: Bridging gap between Spark UI and Code >> CAUTION: This email originates from an external

Re: Bridging gap between Spark UI and Code

2021-05-24 Thread Wenchen Fan
es. But maybe just as a starting point Spark could display the call-site only with unoptimized logical plans? Users would still get a better sense for how the plan’s structure relates to their code. > *From: *mhawes > *Date: *Friday, 21 May 2021 at 22:36 > *To: *dev

Re: Bridging gap between Spark UI and Code

2021-05-24 Thread Will Raschkowski
y 2021 at 22:36 To: dev@spark.apache.org Subject: Re: Bridging gap between Spark UI and Code Revi

Re: Bridging gap between Spark UI and Code

2021-05-21 Thread mhawes
Reviving this thread to ask whether any of the Spark maintainers would consider helping to scope a solution for this. Michal outlines the problem in this thread, but to clarify: the issue is that for very complex Spark applications, where the logical plans often span many pages, it is extremely hard

Re: Bridging gap between Spark UI and Code

2020-07-21 Thread Michal Sankot
And to be clear: yes, execution plans show exactly what it's doing. The problem is that it's unclear how that relates to the actual Scala/Python code. On 7/21/20 15:45, Michal Sankot wrote: Yes, the problem is that DAGs only refer to the code line (action) that invoked them. It doesn't provide infor

Re: Bridging gap between Spark UI and Code

2020-07-21 Thread Michal Sankot
Yes, the problem is that DAGs only refer to the code line (action) that invoked them. They don't provide information about how individual transformations link to the code. So you can have a dozen stages, each showing the same code line that invoked it, all doing different things. And then we guess what it
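
As an illustration of the missing link, the sketch below records a call site for every transformation at the moment it is declared, not only for the action that triggers execution. It is a plain-Python toy and does not reflect how Spark tracks call sites: `TracedPipeline`, `describe`, and `build` are hypothetical names introduced only for this example.

```python
import traceback
from typing import Callable, Iterable, List, Tuple


class TracedPipeline:
    """Toy lazy pipeline that remembers where each transformation was declared."""

    def __init__(self, data: Iterable[int]) -> None:
        self._data = list(data)
        self._steps: List[Tuple[str, Callable, str]] = []

    def _call_site(self) -> str:
        # Stack order is oldest first: [..., user code, map/filter, _call_site],
        # so index -3 is the frame of the code that declared the transformation.
        frame = traceback.extract_stack()[-3]
        return f"{frame.name}:{frame.lineno}"

    def map(self, fn: Callable[[int], int]) -> "TracedPipeline":
        self._steps.append(("map", fn, self._call_site()))
        return self

    def filter(self, fn: Callable[[int], bool]) -> "TracedPipeline":
        self._steps.append(("filter", fn, self._call_site()))
        return self

    def describe(self) -> List[str]:
        """One line per transformation, each with its own declaration site."""
        return [f"{op} declared at {site}" for op, _, site in self._steps]

    def collect(self) -> List[int]:
        """The 'action': only here are the recorded steps actually executed."""
        out = self._data
        for op, fn, _ in self._steps:
            out = [fn(x) for x in out] if op == "map" else [x for x in out if fn(x)]
        return out


def build() -> TracedPipeline:
    p = TracedPipeline(range(6))
    p.map(lambda x: x * 2)
    p.filter(lambda x: x > 4)
    return p


pipeline = build()
print(pipeline.describe())
print(pipeline.collect())
```

With per-transformation call sites like these, each step could be traced back to its own line of code instead of every stage pointing at the single line that invoked the action.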

Re: Bridging gap between Spark UI and Code

2020-07-21 Thread Russell Spitzer
Have you looked at the DAG visualization? Each block refers to the code line invoking it. For DataFrames the execution plan will let you know explicitly which operations are in which stages. On Tue, Jul 21, 2020, 8:18 AM Michal Sankot wrote: > Hi, > when I analyze and debug our Spark batch jobs