Jefffrey commented on code in PR #17536: URL: https://github.com/apache/datafusion/pull/17536#discussion_r2343365209
##########
datafusion/functions-aggregate/src/average.rs:
##########
@@ -62,6 +62,17 @@ make_udaf_expr_and_func!(
     avg_udaf
 );
+pub fn avg_distinct(expr: Expr) -> Expr {
+    Expr::AggregateFunction(datafusion_expr::expr::AggregateFunction::new_udf(
+        avg_udaf(),
+        vec![expr],
+        true,
+        None,
+        vec![],
+        None,
+    ))
+}

Review Comment:
Same as how count handles it: https://github.com/apache/datafusion/blob/bfc5067718a3ddcb87531b5a9633605792078546/datafusion/functions-aggregate/src/count.rs#L71-L80

##########
datafusion/core/tests/dataframe/mod.rs:
##########
@@ -496,32 +497,35 @@ async fn drop_with_periods() -> Result<()> {
 #[tokio::test]
 async fn aggregate() -> Result<()> {
     // build plan using DataFrame API
-    let df = test_table().await?;
+    // union so some of the distincts have a clearly distinct result
+    let df = test_table().await?.union(test_table().await?)?;
     let group_expr = vec![col("c1")];
     let aggr_expr = vec![
-        min(col("c12")),
-        max(col("c12")),
-        avg(col("c12")),
-        sum(col("c12")),
-        count(col("c12")),
-        count_distinct(col("c12")),
+        min(col("c4")).alias("min(c4)"),
+        max(col("c4")).alias("max(c4)"),
+        avg(col("c4")).alias("avg(c4)"),
+        avg_distinct(col("c4")).alias("avg_distinct(c4)"),
+        sum(col("c4")).alias("sum(c4)"),
+        sum_distinct(col("c4")).alias("sum_distinct(c4)"),
+        count(col("c4")).alias("count(c4)"),
+        count_distinct(col("c4")).alias("count_distinct(c4)"),

Review Comment:
I switched to `c4` from `c12` as `c12` had some precision variations for avg_distinct leading to inconsistent test results, and figured it was easier to switch columns than slap `round` on the outputs

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
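A note on the `c12` precision remark above: one plausible cause (an assumption, the comment does not confirm it) is that floating-point addition is not associative, so an average over distinct values can change in its last digits when those values are visited in a different order. The sketch below uses only the Rust standard library; `avg_f64` and `avg_i64` are hypothetical helpers written for illustration and are not DataFusion APIs. It also assumes `c4` is an integer column, which the comment does not state.

```rust
// Hypothetical helpers for illustration only; not part of DataFusion.

/// Average of f64 values, accumulated in the order given.
fn avg_f64(values: &[f64]) -> f64 {
    values.iter().sum::<f64>() / values.len() as f64
}

/// Average of i64 values. The integer sum is exact (absent overflow),
/// so the visiting order cannot change the result.
fn avg_i64(values: &[i64]) -> f64 {
    values.iter().sum::<i64>() as f64 / values.len() as f64
}

fn main() {
    // The same three distinct floats, visited in two different orders.
    // In the first order the 1.0 is absorbed by rounding when added to 1e16.
    let order_a = [1e16, 1.0, -1e16];
    let order_b = [1e16, -1e16, 1.0];
    println!("float avg, order a: {}", avg_f64(&order_a));
    println!("float avg, order b: {}", avg_f64(&order_b));

    // The same three distinct integers in two orders give identical averages.
    let ints_a = [4_i64, -7, 10];
    let ints_b = [10_i64, 4, -7];
    println!("int avg, order a: {}", avg_i64(&ints_a));
    println!("int avg, order b: {}", avg_i64(&ints_b));
}
```

If this is indeed what happened, switching the test to an integer column sidesteps the order dependence without having to wrap the outputs in `round`.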
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org