[ https://issues.apache.org/jira/browse/SPARK-51505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ziqi Liu updated SPARK-51505:
-----------------------------
    Description:
There are cases where a shuffle is highly skewed and many partitions are empty (probably due to a small NDV, i.e. number of distinct values). In such cases the AQE coalesce metrics can look confusing, and a user might think AQE wrongly coalesced into large partitions, when in fact a few partitions are very large while the others are empty.

We should log the number of empty partitions in the metrics.

  was:
There're cases where shuffle is highly skewed and many partitions (probably due to small NDV), AQE coalesce metrics might look confusing and user might think it wrongly coalesce to large partitions, while the actual situation is that a few partitions are super large while others are empty.

We'd better log empty partition number in the metrics.

> Log empty partition number metrics in AQE coalesce
> --------------------------------------------------
>
>                 Key: SPARK-51505
>                 URL: https://issues.apache.org/jira/browse/SPARK-51505
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Ziqi Liu
>            Priority: Major
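
To illustrate why an empty-partition count would help when reading these metrics, here is a minimal Python sketch. It is not Spark code and does not reflect how the metric is or will be implemented: the function name `summarize_partitions`, the `PartitionSummary` fields, and the sample sizes are all hypothetical, and it only assumes that per-partition byte sizes are available as a list.

```python
from dataclasses import dataclass
from statistics import median
from typing import Sequence


@dataclass
class PartitionSummary:
    total: int                      # number of shuffle partitions
    empty: int                      # partitions holding zero bytes
    non_empty: int                  # partitions holding some data
    max_bytes: int                  # size of the largest partition
    median_non_empty_bytes: float   # median over non-empty partitions only


def summarize_partitions(sizes: Sequence[int]) -> PartitionSummary:
    """Summarize per-partition byte sizes, counting empty partitions separately."""
    non_empty = [s for s in sizes if s > 0]
    return PartitionSummary(
        total=len(sizes),
        empty=len(sizes) - len(non_empty),
        non_empty=len(non_empty),
        max_bytes=max(sizes, default=0),
        median_non_empty_bytes=median(non_empty) if non_empty else 0.0,
    )


# Hypothetical skewed shuffle: 200 partitions, only 3 of which hold data.
MIB = 1024 * 1024
sizes = [0] * 197 + [512 * MIB, 480 * MIB, 450 * MIB]
summary = summarize_partitions(sizes)
print(summary)
```

With these made-up sizes, seeing 197 empty partitions next to the maximum partition size suggests that the large partitions come from skew in the data rather than from the coalescing step.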