You can see the SQL plan node name in the DAG visualization. Please refer
to https://spark.apache.org/docs/latest/web-ui.html for more details. If
you still have any confusion, please let us know and we will keep improving
the document.
On Tue, May 25, 2021 at 4:41 AM mhawes wrote:
@Wenchen Fan, understood that the mapping of query plan to application code
is very hard. I was wondering if we might be able to instead just handle the
mapping from the final physical plan to the stage graph. So for example
you’d be able to tell what part of the plan generated which stages. I feel
this alone would be a big improvement.
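The plan-to-stage mapping suggested above is plausible because Spark's stage boundaries correspond to shuffle exchanges. Below is a toy, non-Spark Python sketch of the idea — splitting a physical-plan tree at `Exchange` nodes to see which operators land in which stage. All names here are illustrative, not Spark internals:

```python
# Toy sketch (not Spark's implementation): stages are separated by shuffle
# boundaries, so cutting a physical-plan tree at "Exchange" nodes groups
# the remaining operators by stage.
class Node:
    def __init__(self, name, *children):
        self.name = name
        self.children = list(children)

def stages(node, current=None, out=None):
    """Group operator names into stages, cutting at 'Exchange' nodes."""
    if out is None:
        current = []
        out = [current]
    if node.name == "Exchange":
        # A shuffle: everything below it runs in a separate (upstream) stage.
        new_stage = []
        out.append(new_stage)
        for child in node.children:
            stages(child, new_stage, out)
    else:
        current.append(node.name)
        for child in node.children:
            stages(child, current, out)
    return out

# A typical two-stage aggregation plan: partial aggregate, shuffle, final aggregate.
plan = Node("HashAggregate",
            Node("Exchange",
                 Node("HashAggregate",
                      Node("Scan"))))
print(stages(plan))  # [['HashAggregate'], ['HashAggregate', 'Scan']]
```

With call-site information attached to each `Node`, the same walk would tell you which user code each stage came from.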
> […] But maybe, just as a starting point, Spark could display the call-site
> only with unoptimized logical plans? Users would still get a better sense
> for how the plan’s structure relates to their code.
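The call-site idea quoted above can be sketched in plain Python: record, at the moment a transformation is invoked, the file and line of the user code that called it, so a plan node can later be shown alongside its origin. This is only an illustration of the approach with hypothetical names, not Spark's actual mechanism:

```python
import traceback

class PlanNode:
    """A hypothetical plan node that remembers where user code created it."""
    def __init__(self, op_name, call_site):
        self.op_name = op_name
        self.call_site = call_site  # e.g. "my_job.py:42"

def _caller_site():
    # Stack is oldest-first: [..., user frame, API function, _caller_site].
    # [-3] is therefore the user code that called the public API function.
    frame = traceback.extract_stack()[-3]
    return f"{frame.filename}:{frame.lineno}"

def select(cols):
    # Record where user code invoked this transformation.
    return PlanNode(f"select({cols})", _caller_site())

node = select("a, b")
print(node.op_name, "created at", node.call_site)
```

Displaying such a call site next to each unoptimized logical-plan node is exactly the "starting point" the message suggests; the hard part the thread discusses is that optimizers rewrite and merge nodes, so the recorded site may no longer match the final physical plan.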
>
> *From: *mhawes
> *Date: *Friday, 21 May 2021 at 22:36
> *To: *dev@spark.apache.org
> *Subject: *Re: Bridging gap between Spark UI and Code
Reviving this thread to ask whether any of the Spark maintainers would
consider helping to scope a solution for this. Michal outlines the problem
earlier in this thread, but to clarify: the issue is that for very complex
Spark applications, where the logical plans often span many pages, it is
extremely hard to relate the plan back to the application code that produced
it.
And to be clear: yes, execution plans show exactly what Spark is doing. The
problem is that it's unclear how that relates to the actual Scala/Python
code.
On 7/21/20 15:45, Michal Sankot wrote:
Yes, the problem is that DAGs only refer to the code line (action) that
invoked them. They don't provide information about how individual
transformations link to the code.
So you can have a dozen stages, each tagged with the same invoking code
line, each doing different things. And then we're left guessing what each
one actually does.
Have you looked at the DAG visualization? Each block refers to the code line
invoking it.
For DataFrames, the execution plan will let you know explicitly which
operations are in which stages.
On Tue, Jul 21, 2020, 8:18 AM Michal Sankot
wrote:
> Hi,
> when I analyze and debug our Spark batch jobs