robertrichter opened a new issue, #6397: URL: https://github.com/apache/hudi/issues/6397
**To Reproduce**

Steps to reproduce the behavior:

1. Examine the SQL tab in the Spark History Server web UI after the Hudi write process has finished.

**Expected behavior**

The Spark History Server provides very useful information for analysing performance issues in complex Spark jobs (multiple joins, aggregates, intermediate dataframes, etc.). The SQL tab in particular gives a good overview of a complex transformation, with useful metrics such as partitions and input/output records. This information is displayed in the SQL tab for all target file formats (parquet, orc, etc.) except Hudi. For Hudi writes, the SQL tab only shows the physical plan in text format. It would be great to see the graphical DAG with its metrics (https://spark.apache.org/docs/3.1.1/web-ui.html#sql-tab).

**Environment Description**

* Hudi version : 0.10.0
* Spark version : 3.1.1
* Hive version : 3.1.3
* Hadoop version : 3.1.1
* Storage (HDFS/S3/GCS..) : HDFS
* Running on Docker? (yes/no) : no -> YARN on Cloudera CDP 7.1.7

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
