pepijnve commented on issue #17158:
URL: https://github.com/apache/datafusion/issues/17158#issuecomment-3265646839

   @alamb I had a quick look at the current implementation that handles this. I believe this case is now handled by `ConstEvaluator`. If I'm reading the code correctly, the implementation currently does not take associativity into account. You can see the same thing happening with integer addition, for instance:
   
   ```
   > explain select c1 + 1 + 2 + c2 from t;
   +---------------+---------------------------------------------------------------------------------+
   | plan_type     | plan                                                                            |
   +---------------+---------------------------------------------------------------------------------+
   | logical_plan  | Projection: t.c1 + Int64(1) + Int64(2) + t.c2                                   |
   |               |   TableScan: t projection=[c1, c2]                                              |
   | physical_plan | ProjectionExec: expr=[c1@0 + 1 + 2 + c2@1 as t.c1 + Int64(1) + Int64(2) + t.c2] |
   |               |   DataSourceExec: partitions=1, partition_sizes=[1]                             |
   |               |                                                                                 |
   +---------------+---------------------------------------------------------------------------------+
   ```
   
   The current implementation of `ConstEvaluator` sees
   
   ```
   +
     +
       +
         c1
         1
       2
     c2
   ```
   
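   Since every `BinaryOp` subtree contains a column reference (`c1` sits at the innermost level), each subtree, and with it the entire tree, gets marked as `can_evaluate: false`. The literals `1` and `2` are never the two operands of the same operator, so `1 + 2` is never seen as a constant subexpression.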
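
To make that concrete, here is a minimal sketch of a bottom-up "can this subtree be evaluated at plan time" check. The `Expr` enum and the `can_evaluate` and `example` functions are simplified stand-ins invented for illustration; they are not DataFusion's actual `Expr` or `ConstEvaluator` code.

```rust
// Hypothetical, simplified expression tree (not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Add(Box<Expr>, Box<Expr>),
}

fn add(l: Expr, r: Expr) -> Expr {
    Expr::Add(Box::new(l), Box::new(r))
}

/// A subtree is constant-evaluable only if it contains no column reference.
fn can_evaluate(expr: &Expr) -> bool {
    match expr {
        Expr::Column(_) => false,
        Expr::Literal(_) => true,
        Expr::Add(l, r) => can_evaluate(l) && can_evaluate(r),
    }
}

/// Builds the left-deep tree for `c1 + 1 + 2 + c2`: ((c1 + 1) + 2) + c2.
fn example() -> Expr {
    let c1 = Expr::Column("c1".to_string());
    let c2 = Expr::Column("c2".to_string());
    add(add(add(c1, Expr::Literal(1)), Expr::Literal(2)), c2)
}
```

In this model every `Add` node of `example()` has `c1` somewhere beneath it, so `can_evaluate` is false at every level, while a standalone `add(Expr::Literal(1), Expr::Literal(2))` would be evaluable.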
   
   In our current query engine implementation and optimiser I handled this case by modelling associativity as a property at the binary operator/function level (and commutativity, for a similar canonicalisation rewrite rule). When the optimiser encounters a tree of identical associative operators, it collects the operands of the tree into a list and then tests each adjacent pair of operands for const evaluation.
   
   So for `c1 + 1 + 2 + c2`
   ```
   +
     +
       +
         c1
         1
       2
     c2
   ```
   becomes
   ```
   +
     c1, 1, 2, c2
   ```
   which simplifies to
   ```
   +
     c1, 3, c2
   ```
   and then gets expanded back to
   ```
   +
     +
       c1
       3
     c2
   ```
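
The flatten, fold, and rebuild steps above could be sketched roughly as follows. This is an illustration under assumptions, not the code from our engine or from DataFusion: the `Expr` enum and the `flatten`, `fold_adjacent`, `rebuild`, and `simplify` names are hypothetical, only integer `+` is modelled, and a pair of literals is left unfolded if the sum would overflow `i64`.

```rust
// Hypothetical, simplified expression tree (not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Add(Box<Expr>, Box<Expr>),
}

fn add(l: Expr, r: Expr) -> Expr {
    Expr::Add(Box::new(l), Box::new(r))
}

/// Collects the operands of a chain of `Add` nodes, left to right.
/// Recursing into both children is only valid because `+` is associative.
fn flatten(expr: Expr, out: &mut Vec<Expr>) {
    match expr {
        Expr::Add(l, r) => {
            flatten(*l, out);
            flatten(*r, out);
        }
        other => out.push(other),
    }
}

/// Folds each adjacent pair of literal operands into a single literal.
fn fold_adjacent(operands: Vec<Expr>) -> Vec<Expr> {
    let mut folded: Vec<Expr> = Vec::new();
    for op in operands {
        // Try to combine the previous operand with the current one.
        let merged = match (folded.last(), &op) {
            (Some(Expr::Literal(a)), Expr::Literal(b)) => a.checked_add(*b),
            _ => None,
        };
        match merged {
            Some(sum) => *folded.last_mut().unwrap() = Expr::Literal(sum),
            None => folded.push(op),
        }
    }
    folded
}

/// Rebuilds a left-deep `Add` tree; returns `None` for an empty list.
fn rebuild(operands: Vec<Expr>) -> Option<Expr> {
    operands.into_iter().reduce(add)
}

fn simplify(expr: Expr) -> Expr {
    let mut operands = Vec::new();
    flatten(expr, &mut operands);
    rebuild(fold_adjacent(operands)).expect("flatten yields at least one operand")
}
```

Calling `simplify` on the tree for `c1 + 1 + 2 + c2` yields the tree for `c1 + 3 + c2`. Because only adjacent operands are combined, an input such as `1 + c1 + 2` is left unchanged in this sketch; folding it would additionally require using commutativity to reorder the operands.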

