Hi, I wonder why the changes made in "[SPARK-9241][SQL] Supporting multiple DISTINCT columns (2) - Rewriting Rule" are no longer present in Spark (version 2.4). As a result, count distinct runs much slower in Spark 2.4.4 than in Spark 1.6 or Hive: more than 18 minutes in Spark 2.4.4, compared with about 3 minutes in Spark 1.6 and about 80 seconds in Hive.