neilconway commented on code in PR #20436:
URL: https://github.com/apache/datafusion/pull/20436#discussion_r2855212342
##########
datafusion/functions/src/string/concat.rs:
##########
@@ -88,37 +88,33 @@ impl ScalarUDFImpl for ConcatFunc {
&self.signature
}
+ /// Match the return type to the input types to avoid unnecessary casts. On
+ /// mixed inputs, prefer Utf8View; prefer LargeUtf8 over Utf8 to avoid
+ /// potential overflow on LargeUtf8 input.
fn return_type(&self, arg_types: &[DataType]) -> Result<DataType> {
use DataType::*;
- let mut dt = &Utf8;
- arg_types.iter().for_each(|data_type| {
- if data_type == &Utf8View {
- dt = data_type;
- }
- if data_type == &LargeUtf8 && dt != &Utf8View {
- dt = data_type;
- }
- });
-
- Ok(dt.to_owned())
+ if arg_types.contains(&Utf8View) {
+ Ok(Utf8View)
+ } else if arg_types.contains(&LargeUtf8) {
Review Comment:
Interesting point. I believe the typical precedence today is Utf8View >
LargeUtf8 > Utf8, partly on the grounds that "StringArray to StringViewArray is
cheap but not vice versa". I can see arguments for both sides; if we want to
reconsider this, seems like a distinct issue?
https://github.com/apache/datafusion/blob/e894a03bea638e35677eaf27876966013dd64bf4/datafusion/expr-common/src/type_coercion/binary.rs#L1663-L1667
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]