jayzhan211 opened a new issue, #10205:
URL: https://github.com/apache/datafusion/issues/10205

   ### Is your feature request related to a problem or challenge?
   
   File an issue to discuss about design in #10193.
   
   Previously, we always provided a null array if the function supports zero 
args, like `random(), pi(), make_array()`.
   I think the additional null array should not provided for all the function 
that supports 0 args, instead handle case by case in each function. It turns 
out the null array is not useful either. `random`, `pi`, and `uuid` takes the 
number of rows instead of the actual null array.
   
   We need to design an alternative way to communicate the number of rows to 
the function.
   
   Proposal 1:
   Add `support_randomness -> bool` to `ScalarUDFImpl`, and we provide the 
number of rows as the first argument for `invoke`.
   
   Proposal 2:
   We always provide the number of rows to `invoke`. 
   
   Change from 
   ```rust
   fn invoke(&self, args: &[ColumnarValue]) -> Result<ColumnarValue>;
   ```
   to
   ```rust
   /// batch_rows is the number of rows in each batch
   fn invoke(&self, args: &[ColumnarValue], batch_rows: usize) -> 
Result<ColumnarValue>;
   ```
   
   ```rust
   fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
           let inputs = self
               .args
               .iter()
               .map(|e| e.evaluate(batch))
               .collect::<Result<Vec<_>>>()?;
   
           // evaluate the function
           match self.fun {
               ScalarFunctionDefinition::UDF(ref fun) => fun.invoke(&inputs, 
batch.num_rows()),
               ScalarFunctionDefinition::Name(_) => {
                   internal_err!(
                       "Name function must be resolved to one of the other 
variants prior to physical planning"
                   )
               }
           }
       }
   ```
   
   
   I prefer the second one, it is more aggressive and breaks the signature, but 
`batch` is part of the `evaluate()`, provide information about `batch` and 
`args` to `invoke()` makes sense to me 
   
   ### Describe the solution you'd like
   
   _No response_
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to