kosiew commented on code in PR #20787:
URL: https://github.com/apache/datafusion/pull/20787#discussion_r2965272000
##########
datafusion/physical-expr/src/expressions/binary.rs:
##########
@@ -933,6 +933,18 @@ fn pre_selection_scatter(
}
fn concat_elements(left: &ArrayRef, right: &ArrayRef) -> Result<ArrayRef> {
+ if *left.data_type() == DataType::Binary && *right.data_type() == DataType::Binary {
+ // Cast Binary to Utf8 to validate UTF-8 encoding before concatenation
Review Comment:
Could this Binary-to-Utf8 normalization live in a helper shared with the
`concat` UDF path?
Right now the operator has its own validation branch while the UDF handles
binary input separately, which makes it harder to keep error semantics and
future binary/string coercion changes aligned.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.