jkosh44 commented on code in PR #14532: URL: https://github.com/apache/datafusion/pull/14532#discussion_r1949114839
########## datafusion/expr-common/src/signature.rs: ########## @@ -227,25 +226,13 @@ impl Display for TypeSignatureClass { #[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Hash)] pub enum ArrayFunctionSignature { - /// Specialized Signature for ArrayAppend and similar functions - /// The first argument should be List/LargeList/FixedSizedList, and the second argument should be non-list or list. - /// The second argument's list dimension should be one dimension less than the first argument's list dimension. - /// List dimension of the List/LargeList is equivalent to the number of List. - /// List dimension of the non-list is 0. - ArrayAndElement, - /// Specialized Signature for ArrayPrepend and similar functions - /// The first argument should be non-list or list, and the second argument should be List/LargeList. - /// The first argument's list dimension should be one dimension less than the second argument's list dimension. - ElementAndArray, - /// Specialized Signature for Array functions of the form (List/LargeList, Index+) - /// The first argument should be List/LargeList/FixedSizedList, and the next n arguments should be Int64. - ArrayAndIndexes(NonZeroUsize), - /// Specialized Signature for Array functions of the form (List/LargeList, Element, Optional Index) - ArrayAndElementAndOptionalIndex, - /// Specialized Signature for ArrayEmpty and similar functions - /// The function takes a single argument that must be a List/LargeList/FixedSizeList - /// or something that can be coerced to one of those types. - Array, + /// A function takes at least one List/LargeList/FixedSizeList argument. + Array { + /// A full list of the arguments accepted by this function. + arguments: Vec<ArrayFunctionArgument>, + /// Whether any of the input arrays are modified. + mutability: ArrayFunctionMutability, Review Comment: > and the return type is determined by the mutability setting. Isn't it 🤔 ? I'm still unfamiliar with much of the code base, so please take everything I'm saying with a grain of salt. The mutability of the function is only ever looked at by the `get_valid_types()` function, which is described in the code as "Returns a Vec of all possible valid argument types for the given signature.". https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/expr/src/type_coercion/functions.rs#L352-L363 From that description, I would conclude that mutability determines the accepted argument types, **_not_** the return type. However, `ScalarUDFImpl` has a couple of trait functions, like `fn return_type(&self, arg_types: &[DataType]) -> Result<DataType>`, that determine the return type as a function of the argument types. https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/expr/src/udf.rs#L540-L596 If we take a look at some of the implementations of `return_type()` for array functions, many of them blindly pass through the argument type of the input array. https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/functions-nested/src/extract.rs#L396-L398 https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/functions-nested/src/extract.rs#L704-L706 https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/functions-nested/src/extract.rs#L811-L813 https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/functions-nested/src/concat.rs#L108-L110 https://github.com/jkosh44/datafusion/blob/53b7ae53af30cc7b8734a6c292cc3e04a993afdc/datafusion/functions-nested/src/concat.rs#L196-L198 So by modifying the accepted argument types we are indirectly modifying the return types. > Instead of modeling "mutability," we can explicitly define the desired type in the function signature. This type can be either List or FixedSizeList, and we coerce the input accordingly It might be a better approach to not modify the accepted argument types (i.e. don't convert `FixedSizeList` to `List` in `get_valid_types()`), and instead move the logic to `return_type()`. Then functions can be explicit about not returning `FixedSizeList`s. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org