findepi commented on code in PR #15110:
URL: https://github.com/apache/datafusion/pull/15110#discussion_r1989510726


##########
datafusion/optimizer/src/simplify_expressions/unwrap_cast.rs:
##########
@@ -177,6 +192,45 @@ pub(super) fn 
is_cast_expr_and_support_unwrap_cast_in_comparison_for_inlist<
     true
 }
 
+fn cast_literal_to_type_with_op(
+    lit_value: &ScalarValue,
+    target_type: &DataType,
+    op: Operator,
+) -> Option<ScalarValue> {
+    macro_rules! cast_or_else_return_none {
+        ($value:ident, $ty:expr) => {{
+            let opts = arrow::compute::CastOptions {
+                safe: false,
+                format_options: Default::default(),
+            };
+            let array = ScalarValue::to_array($value).ok()?;
+            let casted = arrow::compute::cast_with_options(&array, &$ty, 
&opts).ok()?;
+            let scalar = ScalarValue::try_from_array(&casted, 0).ok()?;
+            Some(scalar)
+        }};
+    }
+
+    match (op, lit_value) {
+        (
+            Operator::Eq | Operator::NotEq,
+            ScalarValue::Utf8(Some(_))
+            | ScalarValue::Utf8View(Some(_))
+            | ScalarValue::LargeUtf8(Some(_)),
+        ) => match target_type {
+            DataType::Int8 => cast_or_else_return_none!(lit_value, 
DataType::Int8),
+            DataType::Int16 => cast_or_else_return_none!(lit_value, 
DataType::Int16),
+            DataType::Int32 => cast_or_else_return_none!(lit_value, 
DataType::Int32),
+            DataType::Int64 => cast_or_else_return_none!(lit_value, 
DataType::Int64),
+            DataType::UInt8 => cast_or_else_return_none!(lit_value, 
DataType::UInt8),
+            DataType::UInt16 => cast_or_else_return_none!(lit_value, 
DataType::UInt16),
+            DataType::UInt32 => cast_or_else_return_none!(lit_value, 
DataType::UInt32),
+            DataType::UInt64 => cast_or_else_return_none!(lit_value, 
DataType::UInt64),
+            _ => None,
+        },
+        _ => None,

Review Comment:
   As it's written, and as the function is named, this looks like generic 
approach that will work many different source/target type pairs, we just need 
to add logic.
   However, the exact checks necessary for soundness will likely differ.
   
   For casts between exact numbers (ints and decimals) and string types, the 
https://github.com/apache/datafusion/pull/15110#issuecomment-2714474220 looks 
like a sound plan.
   Plus, we could also simplify impossible target literals like this
   - `cast(an_int as string) = 'abc'` or `cast(an_int as string) = '007'` to 
`if(an_int is null, null, false)`
   
   For casts between numbers and numbers, we likely need to recognize max 
addressible values, and cast impreciseness, especially when we want to handle 
`<`, `<=`, `>`, `>=` operators.
   
https://github.com/trinodb/trino/blob/a870656f406b0f76e97c740659a58554d472994d/core/trino-main/src/main/java/io/trino/sql/planner/iterative/rule/UnwrapCastInComparison.java#L199-L354
 might serve as a good reference. Note how it handles NaN, max values, lossy 
round-tripping, sometimes converting `<` to `<=` as necessary.
   
   For e.g cast from timestamp to date, the logic will likely be different again
   
https://github.com/trinodb/trino/blob/a870656f406b0f76e97c740659a58554d472994d/core/trino-main/src/main/java/io/trino/sql/planner/iterative/rule/UnwrapCastInComparison.java#L357-L389.
   
   With all this in mind, I'd suggest structuring this code so it's clear it 
addresses only the exact number and string case, by checking expr and literal 
types and delegating to a function that has "int" and "utf8" or "string" in the 
name.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to