Hi,

I am seeking advice on measuring the performance of each QueryStage (QS) when AQE is enabled in Spark SQL. Specifically, I need help automatically mapping a QS to its corresponding jobs (or stages) so that I can collect the QS-level runtime metrics.
I have recorded the QS structure via a customized injected Query Stage Optimizer Rule. However, I am blocked on mapping a QS to its corresponding jobs (or stages) in order to aggregate its runtime metrics. I have tried a SparkListener, but neither SparkListenerJobStart nor SparkListenerStageSubmitted carries enough detail to match a job or stage back to a QS. I am thinking of re-compiling Spark to enable the mapping, but I am not experienced with the Spark source code... Simplified sketches of what I have so far are in the P.S. below.

Thanks for your help!

Cheers,
Chenghao
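P.S. For context, here is roughly how I record the QS structure (a simplified sketch, not my full rule; the class name `QsRecorderExtension` is made up, and I register it via `spark.sql.extensions`):

```scala
import org.apache.spark.sql.SparkSessionExtensions
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.execution.SparkPlan
import org.apache.spark.sql.execution.adaptive.QueryStageExec

class QsRecorderExtension extends (SparkSessionExtensions => Unit) {
  override def apply(ext: SparkSessionExtensions): Unit = {
    ext.injectQueryStageOptimizerRule { _ =>
      new Rule[SparkPlan] {
        override def apply(plan: SparkPlan): SparkPlan = {
          // As far as I can tell, AQE applies this rule to each plan fragment
          // before it becomes a new query stage, and the leaf QueryStageExec
          // nodes inside are the already-created input stages, which lets me
          // reconstruct the QS structure. I only record; the plan is unchanged.
          plan.collect { case qs: QueryStageExec =>
            logInfo(s"QueryStage id=${qs.id}, root=${qs.plan.nodeName}")
          }
          plan
        }
      }
    }
  }
}
```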
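And this is the kind of listener I tried (again a simplified sketch with a made-up class name). The job properties give me the SQL execution id and the call site, but nothing that identifies which QueryStage triggered the job, which is exactly where I am stuck:

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobStart, SparkListenerStageSubmitted}

class QsJobListener extends SparkListener {
  override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
    // The properties expose the SQL execution id and call site,
    // but no field points back to a specific QueryStage.
    val props = Option(jobStart.properties)
    val executionId = props.flatMap(p => Option(p.getProperty("spark.sql.execution.id")))
    val callSite = props.flatMap(p => Option(p.getProperty("callSite.short")))
    println(s"Job ${jobStart.jobId}: stages=${jobStart.stageIds.mkString(",")}, " +
      s"executionId=$executionId, callSite=$callSite")
  }

  override def onStageSubmitted(stageSubmitted: SparkListenerStageSubmitted): Unit = {
    // StageInfo only exposes the stage id, name, and call-site details,
    // so I cannot tell which QS this stage belongs to either.
    val info = stageSubmitted.stageInfo
    println(s"Stage ${info.stageId}: name=${info.name}")
  }
}
```

I register it with `spark.sparkContext.addSparkListener(new QsJobListener)` before running the query.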