gokselk commented on code in PR #13805:
URL: https://github.com/apache/datafusion/pull/13805#discussion_r1889249673


##########
datafusion/physical-expr/src/equivalence/properties.rs:
##########
@@ -3651,4 +3661,38 @@ mod tests {
 
         sort_expr
     }
+
+    #[test]
+    fn test_union_constant_value_preservation() -> Result<()> {
+        let schema = Arc::new(Schema::new(vec![
+            Field::new("a", DataType::Int32, true),
+            Field::new("b", DataType::Int32, true),
+        ]));
+
+        let col_a = col("a", &schema)?;
+        let literal_10 = ScalarValue::Int32(Some(10));
+
+        // Create first input with a=10
+        let const_expr1 =
+            ConstExpr::new(Arc::clone(&col_a)).with_value(literal_10.clone());
+        let input1 = EquivalenceProperties::new(Arc::clone(&schema))
+            .with_constants(vec![const_expr1]);
+
+        // Create second input with a=10
+        let const_expr2 =
+            ConstExpr::new(Arc::clone(&col_a)).with_value(literal_10.clone());
+        let input2 = EquivalenceProperties::new(Arc::clone(&schema))
+            .with_constants(vec![const_expr2]);
+
+        // Calculate union properties
+        let union_props = calculate_union(vec![input1, input2], schema)?;
+
+        // Verify column 'a' remains constant with value 10
+        let const_a = &union_props.constants()[0];
+        assert!(const_a.expr().eq(&col_a));
+        assert!(const_a.across_partitions());
+        assert_eq!(const_a.value(), Some(&literal_10));
+
+        Ok(())
+    }
 }

Review Comment:
   > Can you take a look at the first check @gokselk? It should take a few line 
changes in plan_with_order_preserving_variants() function. It should first look 
the order requirements, and if they are matched, then it would try to convert 
CoalescePartitionExec to SortPreservingMergeExec. But before that conversion, 
you can check across_partitions flag of the input constants, and if it is true, 
you can left the CoalescePartitionsExec as is.
   
   The issue is deeper than just checking `across_partitions`. Looking at 
`calculate_union()`, we're currently discarding all equivalence properties at 
UNION, including constants. We should preserve constants that are common across 
all UNION inputs.
   
   For example, in:
   ```sql
   (SELECT * FROM t WHERE x='a') 
   UNION ALL 
   (SELECT * FROM t WHERE x='a')
   ```
   'x' remains constant='a' across the entire UNION, but we're losing this 
information. We should modify `calculate_union()` to preserve such common 
constants.
   
   Would you like me to work on implementing this first?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to