alamb commented on code in PR #10214:
URL: https://github.com/apache/datafusion/pull/10214#discussion_r1579306320
##########
datafusion/expr/src/udaf.rs:
##########
@@ -354,6 +360,24 @@ pub trait AggregateUDFImpl: Debug + Send + Sync {
fn aliases(&self) -> &[String] {
&[]
}
+
+ /// Construct an expression that calculates the aggregate in reverse.
+ /// Typically the "reverse" expression is itself (e.g. SUM, COUNT).
+ /// For aggregates that do not support calculation in reverse,
+ /// returns None (which is the default value).
+ fn reverse_expr(&self) -> ReversedExpr {
+ ReversedExpr::NotSupported
+ }
+}
+
+#[derive(Debug)]
+pub enum ReversedExpr {
+ /// The expression is the same as the original expression, like SUM, COUNT
+ Identical,
+ /// The expression does not support reverse calculation, like ArrayAgg
+ NotSupported,
Review Comment:
I spent some time trying to understand this -- is the reason that ArrayAgg
doesn't support reversed calculations that the array elements would be in a
different order?
##########
datafusion/physical-expr-common/Cargo.toml:
##########
@@ -39,3 +39,4 @@ path = "src/lib.rs"
arrow = { workspace = true }
datafusion-common = { workspace = true, default-features = true }
datafusion-expr = { workspace = true }
+sqlparser = { workspace = true }
Review Comment:
I think ideally the physical expr shouldn't be depending on sql parser (for
people who are not using SQL)
Though now I see that perhaps it is because AggregateFunction has a null
treatment flag on it 🤔 That might be a nice dependency to avoid
##########
datafusion/expr/src/udaf.rs:
##########
@@ -354,6 +360,24 @@ pub trait AggregateUDFImpl: Debug + Send + Sync {
fn aliases(&self) -> &[String] {
&[]
}
+
+ /// Construct an expression that calculates the aggregate in reverse.
+ /// Typically the "reverse" expression is itself (e.g. SUM, COUNT).
+ /// For aggregates that do not support calculation in reverse,
+ /// returns None (which is the default value).
+ fn reverse_expr(&self) -> ReversedExpr {
+ ReversedExpr::NotSupported
+ }
+}
+
+#[derive(Debug)]
+pub enum ReversedExpr {
+ /// The expression is the same as the original expression, like SUM, COUNT
+ Identical,
+ /// The expression does not support reverse calculation, like ArrayAgg
+ NotSupported,
+ /// The expression is different from the original expression
+ Reversed(AggregateFunction),
Review Comment:
I am not sure about returning an `AggregateFunction` here -- this method is
part of `AggregateUDFImpl` trait, but `AggregateFunction` has arguments and
other fields that I don't think will be accessable to an instance of
`AggregateUDFImpl`.
https://github.com/apache/datafusion/blob/4edbdd7d09d97f361748c086afbd7b3dda972f76/datafusion/expr/src/expr.rs#L578-L591
Would it make more sense to return an `AggregateUDF` here:
https://github.com/apache/datafusion/blob/a325825c8e83f18263efd899498eaadc70551de4/datafusion/expr/src/udaf.rs#L33-L67
here instead?
Maybe it would help guide the API design to implement this API for
first_value/last_value udafs 🤔
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]