alamb commented on code in PR #14094:
URL: https://github.com/apache/datafusion/pull/14094#discussion_r1920358041


##########
datafusion/expr/src/udf.rs:
##########
@@ -342,6 +348,56 @@ pub struct ScalarFunctionArgs<'a> {
     pub return_type: &'a DataType,
 }
 
+#[derive(Debug)]
+pub struct ReturnTypeArgs<'a> {

Review Comment:
   Here are some suggestions to improve the comments:
   
   ```suggestion
   /// Arguments passed to [`ScalarUDFImpl::return_type_from_args`]
   #[derive(Debug)]
   pub struct ReturnTypeArgs<'a> {
   ```



##########
datafusion/expr/src/udf.rs:
##########
@@ -342,6 +348,56 @@ pub struct ScalarFunctionArgs<'a> {
     pub return_type: &'a DataType,
 }
 
+#[derive(Debug)]
+pub struct ReturnTypeArgs<'a> {
+    /// The data types of the arguments to the function
+    pub arg_types: &'a [DataType],
+    /// The Utf8 arguments to the function, if the expression is not Utf8, it 
will be empty string

Review Comment:
   If possible I suggest using the following type:
   1. It can distinguish between string/non-string arguments
   3. It avoids clones
   
   ```rust
       pub arguments: &'a [Option<&'a ScalarValue>]>`,
   ```
   



##########
datafusion/expr/src/udf.rs:
##########
@@ -490,6 +547,40 @@ pub trait ScalarUDFImpl: Debug + Send + Sync {
         self.return_type(arg_types)
     }
 
+    /// What [`DataType`] will be returned by this function, given the

Review Comment:
   I also recommend removing the comments from `return_type_from_exprs` now 
that it is deprecated and you have moved the content here



##########
datafusion/expr/src/udf.rs:
##########
@@ -481,6 +537,7 @@ pub trait ScalarUDFImpl: Debug + Send + Sync {
     /// This function must consistently return the same type for the same
     /// logical input even if the input is simplified (e.g. it must return the 
same
     /// value for `('foo' | 'bar')` as it does for ('foobar').
+    #[deprecated(since = "45.0.0", note = "Use `return_type_from_args` 
instead")]

Review Comment:
   Can you also please update the comments on `return_type` to refer to 
`return_type_from_args` (it currently talks about `return_type_from_exprs`:
   
   ```rust
       /// If you provide an implementation for 
[`Self::return_type_from_exprs`],
       /// DataFusion will not call `return_type` (this function). In this case 
it
       /// is recommended to return [`DataFusionError::Internal`].
   ```



##########
datafusion/functions/src/core/named_struct.rs:
##########
@@ -44,11 +44,18 @@ fn named_struct_expr(args: &[ColumnarValue]) -> 
Result<ColumnarValue> {
         .chunks_exact(2)
         .enumerate()
         .map(|(i, chunk)| {
-
             let name_column = &chunk[0];
             let name = match name_column {
-                ColumnarValue::Scalar(ScalarValue::Utf8(Some(name_scalar))) => 
name_scalar,
-                _ => return exec_err!("named_struct even arguments must be 
string literals, got {name_column:?} instead at position {}", i * 2)
+                ColumnarValue::Scalar(ScalarValue::Utf8(Some(name_scalar))) => 
{
+                    name_scalar
+                }
+                // TODO: Implement Display for ColumnarValue

Review Comment:
   That is a good todo -- maybe we can make a follow on ticket for it



##########
datafusion/functions/src/core/coalesce.rs:
##########
@@ -174,39 +177,3 @@ fn get_coalesce_doc() -> &'static Documentation {
             .build()
     })
 }
-
-#[cfg(test)]

Review Comment:
   why remove these tests?



##########
datafusion/physical-expr/src/scalar_function.rs:
##########
@@ -83,6 +86,53 @@ impl ScalarFunctionExpr {
         }
     }
 
+    /// Create a new Scalar function
+    pub fn try_new(
+        fun: Arc<ScalarUDF>,
+        args: Vec<Arc<dyn PhysicalExpr>>,
+        schema: &Schema,
+    ) -> Result<Self> {
+        let name = fun.name().to_string();
+        let arg_types = args
+            .iter()
+            .map(|e| e.data_type(schema))
+            .collect::<Result<Vec<_>>>()?;
+
+        // verify that input data types is consistent with function's 
`TypeSignature`
+        data_types_with_scalar_udf(&arg_types, &fun)?;
+
+        let nullables = args
+            .iter()
+            .map(|e| e.nullable(schema))
+            .collect::<Result<Vec<_>>>()?;
+
+        let arguments: Vec<String> = args
+            .iter()
+            .map(|e| {
+                e.as_any()
+                    .downcast_ref::<Literal>()
+                    .map(|literal| match literal.value() {
+                        ScalarValue::Utf8(Some(s)) => s.clone(),

Review Comment:
   It would be really nice to avoid this clone here -- if we changed the 
signature to `&[Option<&str>]` I think that would be easier to use and more 
efficient



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to