Re: [I] `sql_planner` benchmark panic'ing on main [datafusion]

via GitHub Sun, 28 Sep 2025 03:07:41 -0700


pepijnve commented on issue #17801:
URL: https://github.com/apache/datafusion/issues/17801#issuecomment-3342774125


   There seem to be two things going on here.
   
   First, the expression simplification introduced in the commit we're looking 
at changes nullability in the schema. For the coalesce we can easily derive 
that the result is not nullable if at least one of the expressions is not 
nullable. For the `case` rewrite result this is feasible in theory, but the 
analysis needed to conclude that `CASE WHEN x IS NOT NULL THEN x ELSE y END` 
is not nullable if `y` is not nullable does not seem to be implemented yet.
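
To make the nullability argument concrete, here is a minimal sketch of the kind of analysis described above. It uses a toy `Expr` enum and a hypothetical `nullable` function, not DataFusion's actual `Expr` type or API, and it only recognises the exact `WHEN x IS NOT NULL THEN x` shape for plain columns.

```rust
/// Toy expression tree for illustration only (hypothetical, not DataFusion's `Expr`).
#[derive(Debug, Clone)]
enum Expr {
    Column { name: String, nullable: bool },
    Literal(Option<i64>),
    IsNotNull(Box<Expr>),
    Coalesce(Vec<Expr>),
    Case {
        when_then: Vec<(Expr, Expr)>,
        else_expr: Option<Box<Expr>>,
    },
}

/// True if both expressions are references to the same column name.
fn same_column(a: &Expr, b: &Expr) -> bool {
    matches!(
        (a, b),
        (Expr::Column { name: x, .. }, Expr::Column { name: y, .. }) if x == y
    )
}

/// Conservative nullability: returns `false` only when the result can be
/// shown to never be NULL.
fn nullable(expr: &Expr) -> bool {
    match expr {
        Expr::Column { nullable, .. } => *nullable,
        Expr::Literal(v) => v.is_none(),
        Expr::IsNotNull(_) => false,
        // COALESCE is non-nullable as soon as one argument is non-nullable.
        Expr::Coalesce(args) => args.iter().all(nullable),
        Expr::Case { when_then, else_expr } => {
            // A THEN branch guarded by `<same column> IS NOT NULL` cannot
            // produce NULL, whatever the column's declared nullability.
            let any_then_nullable = when_then.iter().any(|(when, then)| {
                let guarded =
                    matches!(when, Expr::IsNotNull(inner) if same_column(inner, then));
                !guarded && nullable(then)
            });
            // A missing ELSE yields NULL when no WHEN branch matches.
            let else_nullable = else_expr.as_ref().map_or(true, |e| nullable(e));
            any_then_nullable || else_nullable
        }
    }
}
```

With this sketch, `CASE WHEN x IS NOT NULL THEN x ELSE 0 END` over a nullable column `x` is reported as not nullable, which matches the result for `coalesce(x, 0)`.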
   
   Secondly, something seems to be going wrong with the schema at the logical 
plan level. After parsing the SQL, the relevant portion of the logical plan is
   
   ```
   Union [..., sales_cnt:Int64, ...]
                       Projection: ..., CAST(catalog_sales.cs_quantity AS 
Int64) - CASE WHEN __common_expr_7 IS NOT NULL THEN __common_expr_7 ELSE 
Int64(0) END AS sales_cnt, ..., sales_cnt:Int64;N, ...]
   ```
   
   Note that `sales_cnt` is marked nullable in the projection, but not nullable 
in the union. My suspicion is that after the expression simplification the 
schema is not being updated correctly.
   
   After a second optimisation pass we get
   
   ```
   Union [..., sales_cnt:Int64;N, ...]
                       Projection: ..., CAST(catalog_sales.cs_quantity AS 
Int64) - CASE WHEN __common_expr_7 IS NOT NULL THEN __common_expr_7 ELSE 
Int64(0) END AS sales_cnt, ..., sales_cnt:Int64;N, ...]
   ```
   
   In other words, the schema error seems to be getting resolved as a side 
effect of doing another rewrite pass over the tree.
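
The mismatch can be illustrated with a small sketch of how a union's output nullability might be recomputed from its inputs. The `Field` struct and `union_schema` function are hypothetical stand-ins, not DataFusion's types, and the rule used (an output column is nullable if it is nullable in any input) is an assumption about the intended behaviour.

```rust
/// Toy schema field for illustration only (hypothetical, not DataFusion's `Field`).
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    nullable: bool,
}

/// Derives a union's output schema from its input schemas.
/// Names come from the first input; a column is nullable if it is
/// nullable in at least one input. Returns `None` if there are no
/// inputs or the column counts differ.
fn union_schema(inputs: &[Vec<Field>]) -> Option<Vec<Field>> {
    let first = inputs.first()?;
    if inputs.iter().any(|schema| schema.len() != first.len()) {
        return None;
    }
    Some(
        first
            .iter()
            .enumerate()
            .map(|(i, field)| Field {
                name: field.name.clone(),
                nullable: inputs.iter().any(|schema| schema[i].nullable),
            })
            .collect(),
    )
}
```

If a child projection's `sales_cnt` becomes nullable after a rewrite and the union keeps its previously computed schema, the two disagree until something recomputes the union schema from its children, which would be consistent with the second pass fixing it.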

