findepi commented on code in PR #17351:
URL: https://github.com/apache/datafusion/pull/17351#discussion_r2324504677


##########
datafusion/physical-expr-common/src/physical_expr.rs:
##########
@@ -377,6 +377,19 @@ pub trait PhysicalExpr: Any + Send + Sync + Display + 
Debug + DynEq + DynHash {
         // static expressions will always return 0.
         0
     }
+
+    /// Returns true if the expression node is volatile, i.e. whether it can 
return
+    /// different results when evaluated multiple times with the same input.
+    ///
+    /// Note: unlike [`is_volatile`], this function does not consider inputs:
+    /// - `random()` returns `true`,
+    /// - `a + random()` returns `false`
+    ///
+    /// By default, expressions are not volatile to avoid imposing API churn 
on implementers.
+    /// It is highly recommended that volatile expressions implement this 
method and return `true`.
+    fn is_volatile_node(&self) -> bool {

Review Comment:
   > Also maybe `is_volatile` is enough
   
   `is_volatile` should return whether the "expression is volatile".
   it would be enough, but it needs to traverse the expr tree 
   
   
   
   
   
   
   > I could see this being missed + difficult for reviewers to keep in mind 
when new physical expressions are added.
   
   The default impl is not a safe default. What happens when someone has 
existing volatile `PhysicalExpr` implementation and doesn't override this 
method?



##########
datafusion/physical-expr-common/src/physical_expr.rs:
##########
@@ -377,6 +377,19 @@ pub trait PhysicalExpr: Any + Send + Sync + Display + 
Debug + DynEq + DynHash {
         // static expressions will always return 0.
         0
     }
+
+    /// Returns true if the expression node is volatile, i.e. whether it can 
return
+    /// different results when evaluated multiple times with the same input.
+    ///
+    /// Note: unlike [`is_volatile`], this function does not consider inputs:
+    /// - `random()` returns `true`,
+    /// - `a + random()` returns `false`
+    ///
+    /// By default, expressions are not volatile to avoid imposing API churn 
on implementers.
+    /// It is highly recommended that volatile expressions implement this 
method and return `true`.
+    fn is_volatile_node(&self) -> bool {
+        false

Review Comment:
   This is likely not a safe default. What happens when someone has existing 
volatile PhysicalExpr implementation and doesn't override this method?
   
   Today the answer is "nothing", as these functions are not used for anything. 
But as soon as they are used for something (eg some optimization pass), this 
not safe a default may lead to correctness bugs (like constant-folding a rand 
call).
   
   



##########
datafusion/physical-expr-common/src/physical_expr.rs:
##########
@@ -560,3 +573,26 @@ pub fn is_dynamic_physical_expr(expr: &Arc<dyn 
PhysicalExpr>) -> bool {
     // If the generation is non-zero, then this `PhysicalExpr` is dynamic.
     snapshot_generation(expr) != 0
 }
+
+/// Returns true if the expression is volatile, i.e. whether it can return 
different
+/// results when evaluated multiple times with the same input.
+///
+/// For example the function call `RANDOM()` is volatile as each call will
+/// return a different value.
+///
+/// This method recursively checks if any sub-expression is volatile, for 
example
+/// `1 + RANDOM()` will return `true`.
+pub fn is_volatile(expr: &Arc<dyn PhysicalExpr>) -> bool {
+    let mut is_volatile = expr.is_volatile_node();
+    expr.apply(|e| {
+        is_volatile = is_volatile || e.is_volatile_node();

Review Comment:
   No need to call `expr.apply` when `is_volatile` is already true. Does it 
matter?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to