shehabgamin commented on issue #14123:
URL: https://github.com/apache/datafusion/issues/14123#issuecomment-2690645120

   We were able to work around the `count(*)` regression in Sail by doing the 
following when constructing an `Expr::AggregateFunction` for `count_udaf`:
   
https://github.com/lakehq/sail/blob/4fc24cb4162a153eaeaa9d6344ec1d0e97d7e693/crates/sail-plan/src/function/aggregate.rs#L190-L198
   ```
       let args = match arguments.as_slice() {
           [expr::Expr::Wildcard {
               qualifier: None,
               options: _,
           }] => {
               vec![expr::Expr::Literal(COUNT_STAR_EXPANSION)]
           }
           _ => arguments,
       };
   ```
   
   We assume that the aggregate function will receive wildcard expressions 
without nesting. Otherwise, the input is invalid (e.g., `count(* AS foo)`). 
Other projects adopting this workaround need to be aware of the implications 
for their use cases and determine whether they can make the same assumption.
   
   The only other regression is in the `expr_fn::substr(string, position)` UDF.
   
   ### Thoughts
   
   If `Expr::Wildcard` no longer works, why is it still being kept around? 
There is a concerning pattern of various zombie operations that no longer work, 
are surfaced in pre-release testing, and yet are still exposed to downstream 
users.  
   
   In the current and previous pre-release testing, we have surfaced the 
following zombie operations:  
   - `Expr::Wildcard { qualifier: None, .. }`  
   - `ScalarUDFImpl::invoke()`  
   - `ScalarUDFImpl::invoke_batch()`  
   - `ScalarUDFImpl::is_nullable()` (previous release testing)  
   
   If these operations no longer serve a purpose, why don't we either remove 
them entirely as proper breaking changes or modify them to throw an error upon 
construction? Alternatively, if they are no longer relevant for downstream 
users, why not restrict their exposure to internal use within DataFusion when 
applicable?
   
   Otherwise, we are knowingly releasing bugs and breaks, which defeats the 
whole purpose of pre-release testing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to