jayzhan211 commented on code in PR #14440:
URL: https://github.com/apache/datafusion/pull/14440#discussion_r1961771860
##########
datafusion/expr-common/src/signature.rs:
##########
@@ -466,6 +551,186 @@ fn get_data_types(native_type: &NativeType) -> Vec<DataType> {
}
}
+/// Represents type coercion rules for function arguments, specifying both the desired type
+/// and optional implicit coercion rules for source types.
+///
+/// # Examples
+///
+/// ```
+/// use datafusion_expr_common::signature::{Coercion, TypeSignatureClass};
+/// use datafusion_common::types::{NativeType, logical_binary, logical_string};
+///
+/// // Exact coercion that only accepts timestamp types
+/// let exact = Coercion::new_exact(TypeSignatureClass::Timestamp);
+///
+/// // Implicit coercion that accepts string types but can coerce from binary types
+/// let implicit = Coercion::new_implicit(
+/// TypeSignatureClass::Native(logical_string()),
+/// vec![TypeSignatureClass::Native(logical_binary())],
+/// NativeType::String
+/// );
+/// ```
+///
+/// There are two variants:
+///
+/// * `Exact` - Only accepts arguments that exactly match the desired type
+/// * `Implicit` - Accepts the desired type and can coerce from specified source types
+#[derive(Debug, Clone, Eq, PartialOrd)]
+pub enum Coercion {
+ /// Coercion that only accepts arguments exactly matching the desired type.
+ Exact {
+ /// The required type for the argument
+ desired_type: TypeSignatureClass,
+ },
+
+ /// Coercion that accepts the desired type and can implicitly coerce from other types.
+ Implicit {
+ /// The primary desired type for the argument
+ desired_type: TypeSignatureClass,
+ /// Rules for implicit coercion from other types
+ implicit_coercion: ImplicitCoercion,
+ },
Review Comment:
> Let's consider example of substr(s, i) function.
> The call to substr should succeed for i being any integer type coercible to
> UInt64 or Int64.
You’ve defined that the second argument of substr can be any integer type
coercible to Int64. Isn’t this part of the function definition? With that
definition, the function knows which coercion is needed. However, making this
one definition possible is not enough. If we want to allow coercion from a
string to an integer type, or if we only expect coercion to Int32, those
options should be possible as well. Given this, we can't decide how coercion
should happen without being told by the user or the function definition.
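To illustrate the point, here is a minimal sketch in which the function definition itself carries the coercion rule. It uses simplified stand-in types, not DataFusion's actual API: `TypeClass`, the reduced `Coercion` enum, `resolve`, and both `substr_position*` helpers are hypothetical names chosen for this example.

```rust
/// Hypothetical stand-in for a type class; NOT DataFusion's `TypeSignatureClass`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TypeClass {
    Int32,
    Int64,
    Utf8,
}

/// Simplified model of the `Coercion` enum in the diff above (assumption:
/// the implicit rule is reduced to a plain list of allowed source types).
#[derive(Debug, Clone)]
enum Coercion {
    /// Accept only the desired type.
    Exact { desired: TypeClass },
    /// Accept the desired type, or coerce from one of the listed sources.
    Implicit {
        desired: TypeClass,
        allowed_sources: Vec<TypeClass>,
    },
}

impl Coercion {
    /// Returns the type the argument ends up with, or `None` if it is rejected.
    fn resolve(&self, arg: TypeClass) -> Option<TypeClass> {
        match self {
            Coercion::Exact { desired } => (arg == *desired).then_some(*desired),
            Coercion::Implicit {
                desired,
                allowed_sources,
            } => (arg == *desired || allowed_sources.contains(&arg)).then_some(*desired),
        }
    }
}

/// Hypothetical rule for the second argument of `substr`: integers coerce to Int64.
fn substr_position() -> Coercion {
    Coercion::Implicit {
        desired: TypeClass::Int64,
        allowed_sources: vec![TypeClass::Int32],
    }
}

/// Hypothetical variant chosen by a different function author: strings are allowed too.
fn substr_position_with_strings() -> Coercion {
    Coercion::Implicit {
        desired: TypeClass::Int64,
        allowed_sources: vec![TypeClass::Int32, TypeClass::Utf8],
    }
}
```

In this sketch, whether a string argument is accepted depends only on which rule the function author picked, which is the kind of choice a fixed, global coercion rule could not express.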
> For example, integer types should be coercible to broader integer types
> the same way regardless whether it's in context of UNION ALL, EXCEPT, a
> function call, or an operator.
For internal DataFusion use cases, this rule works well. However, from an
extensibility perspective, restricting coercion rules reduces flexibility. What
if users don't want a broader integer type for a specific workflow because they
know the maximum value fits in i32? We should prioritize maximum flexibility
while also providing built-in options for general use cases to ensure ease of
use.
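A rough sketch of what "built-in option plus user override" could look like, under stated assumptions: `IntType`, `IntRule`, and `resolve_int` are hypothetical names for this example and are not part of DataFusion. `Widen` models the general-purpose default, while `Pinned` models a user who knows the values fit in a narrower type.

```rust
/// Hypothetical integer type lattice, ordered from narrowest to widest.
/// The derived `Ord` follows declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum IntType {
    Int8,
    Int16,
    Int32,
    Int64,
}

/// Hypothetical per-function coercion rule.
enum IntRule {
    /// Built-in default: coerce every argument to the widest argument type.
    Widen,
    /// User-defined: coerce to a fixed target and reject anything wider.
    Pinned(IntType),
}

/// Picks the common type for `args`, or `None` if there are no arguments
/// or an argument is wider than a pinned target.
fn resolve_int(rule: &IntRule, args: &[IntType]) -> Option<IntType> {
    let widest = args.iter().copied().max()?;
    match rule {
        IntRule::Widen => Some(widest),
        IntRule::Pinned(target) => (widest <= *target).then_some(*target),
    }
}
```

With `IntRule::Pinned(IntType::Int32)`, Int8 and Int16 arguments resolve to Int32 and an Int64 argument is rejected, while `IntRule::Widen` keeps the general-purpose behavior for everyone else.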
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]