jkosh44 commented on code in PR #14532:
URL: https://github.com/apache/datafusion/pull/14532#discussion_r1949114839


##########
datafusion/expr-common/src/signature.rs:
##########
@@ -227,25 +226,13 @@ impl Display for TypeSignatureClass {
 
 #[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Hash)]
 pub enum ArrayFunctionSignature {
-    /// Specialized Signature for ArrayAppend and similar functions
-    /// The first argument should be List/LargeList/FixedSizedList, and the second argument should be non-list or list.
-    /// The second argument's list dimension should be one dimension less than the first argument's list dimension.
-    /// List dimension of the List/LargeList is equivalent to the number of List.
-    /// List dimension of the non-list is 0.
-    ArrayAndElement,
-    /// Specialized Signature for ArrayPrepend and similar functions
-    /// The first argument should be non-list or list, and the second argument should be List/LargeList.
-    /// The first argument's list dimension should be one dimension less than the second argument's list dimension.
-    ElementAndArray,
-    /// Specialized Signature for Array functions of the form (List/LargeList, Index+)
-    /// The first argument should be List/LargeList/FixedSizedList, and the next n arguments should be Int64.
-    ArrayAndIndexes(NonZeroUsize),
-    /// Specialized Signature for Array functions of the form (List/LargeList, Element, Optional Index)
-    ArrayAndElementAndOptionalIndex,
-    /// Specialized Signature for ArrayEmpty and similar functions
-    /// The function takes a single argument that must be a List/LargeList/FixedSizeList
-    /// or something that can be coerced to one of those types.
-    Array,
+    /// A function takes at least one List/LargeList/FixedSizeList argument.
+    Array {
+        /// A full list of the arguments accepted by this function.
+        arguments: Vec<ArrayFunctionArgument>,
+        /// Whether any of the input arrays are modified.
+        mutability: ArrayFunctionMutability,

Review Comment:
   > and the return type is determined by the mutability setting. Isn't it 🤔 ?
   
   I'm still unfamiliar with much of the code base, so please take everything 
I'm saying with a grain of salt. The mutability of the function is only ever 
read by the `get_valid_types()` function, which the code describes as 
"Returns a Vec of all possible valid argument types for the given signature."
   
   
https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/expr/src/type_coercion/functions.rs#L352-L363
   
   From that description, I would conclude that mutability determines the 
accepted argument types, **_not_** the return type.
   
   However, `ScalarUDFImpl` has a couple of trait functions, like `fn 
return_type(&self, arg_types: &[DataType]) -> Result<DataType>`, that determine 
the return type as a function of the argument types.
   
   
https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/expr/src/udf.rs#L540-L596
   
   If we take a look at some of the implementations of `return_type()` for 
array functions, many of them blindly pass through the argument type of the 
input array.
   
   
https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/functions-nested/src/extract.rs#L396-L398
   
https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/functions-nested/src/extract.rs#L704-L706
   
https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/functions-nested/src/extract.rs#L811-L813
   
https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/functions-nested/src/concat.rs#L108-L110
   
https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/functions-nested/src/concat.rs#L196-L198
   
   So by modifying the accepted argument types, we are indirectly modifying 
the return types.
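
   To make the indirection concrete, here is a minimal sketch of what I mean. 
It does not use the real DataFusion or Arrow types: `ToyType`, `coerce_arg()`, 
and `return_type_pass_through()` are hypothetical stand-ins that I made up for 
illustration, and the real coercion logic is more involved.

```rust
/// Hypothetical, simplified stand-in for Arrow's `DataType` (not the real enum).
#[derive(Debug, Clone, PartialEq)]
enum ToyType {
    Int64,
    List(Box<ToyType>),
    FixedSizeList(Box<ToyType>, i32),
}

/// Hypothetical coercion step: when `convert_fixed_size` is set, a
/// `FixedSizeList` argument is rewritten to `List`; everything else is kept.
fn coerce_arg(arg: &ToyType, convert_fixed_size: bool) -> ToyType {
    match arg {
        ToyType::FixedSizeList(inner, _) if convert_fixed_size => {
            ToyType::List(inner.clone())
        }
        other => other.clone(),
    }
}

/// Pass-through `return_type`: returns whatever type the first (already
/// coerced) argument has, or `None` if there are no arguments.
fn return_type_pass_through(arg_types: &[ToyType]) -> Option<ToyType> {
    arg_types.first().cloned()
}
```

   With conversion enabled, a `FixedSizeList` input comes back as a `List` 
even though `return_type_pass_through()` never mentions either type. The 
coercion step alone decided the result.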
   
   > Instead of modeling "mutability," we can explicitly define the desired 
type in the function signature. This type can be either List or FixedSizeList, 
and we coerce the input accordingly
   
   It might be a better approach to not modify the accepted argument types 
(i.e. don't convert `FixedSizeList` to `List` in `get_valid_types()`), and 
instead move the logic to `return_type()`. Then functions can be explicit about 
not returning `FixedSizeList`s.
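
   A rough sketch of that alternative, again with made-up names rather than 
the real API: `SketchType` is a hypothetical stand-in for Arrow's `DataType`, 
and `append_like_return_type()` is a hypothetical `return_type()` for an 
append-style function. The arguments are left untouched and the conversion is 
spelled out in the return type.

```rust
/// Hypothetical, simplified stand-in for Arrow's `DataType` (not the real enum).
#[derive(Debug, Clone, PartialEq)]
enum SketchType {
    Int64,
    List(Box<SketchType>),
    FixedSizeList(Box<SketchType>, i32),
}

/// Hypothetical `return_type` for an append-like function. The argument
/// types are not rewritten beforehand; the function itself states that a
/// `FixedSizeList` input yields a `List` output.
fn append_like_return_type(arg_types: &[SketchType]) -> Result<SketchType, String> {
    match arg_types.first() {
        // The result may change length, so it cannot stay fixed-size.
        Some(SketchType::FixedSizeList(inner, _)) => Ok(SketchType::List(inner.clone())),
        // A variable-size list is returned with the same type.
        Some(list @ SketchType::List(_)) => Ok(list.clone()),
        Some(other) => Err(format!("expected a list argument, got {other:?}")),
        None => Err("expected at least one argument".to_string()),
    }
}
```

   The trade-off in this sketch is that every such function has to handle 
`FixedSizeList` itself, instead of relying on a shared coercion step.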



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

