paleolimbot commented on code in PR #21714:
URL: https://github.com/apache/datafusion/pull/21714#discussion_r3183459223
##########
datafusion/expr/src/logical_plan/builder.rs:
##########
@@ -272,8 +272,10 @@ impl LogicalPlanBuilder {
let n_cols = values[0].len();
let mut fields = ValuesFields::new();
for j in 0..n_cols {
- let field_type = schema.field(j).data_type();
- let field_nullable = schema.field(j).is_nullable();
+ let field = schema.field(j);
+ let field_type = field.data_type();
+ let field_nullable = field.is_nullable();
+ let field_metadata = FieldMetadata::new_from_field(field);
Review Comment:
This can use:
https://github.com/apache/datafusion/blob/fa9ada36871074b622f3ca67fcdaf34d7a1efdbc/datafusion/common/src/metadata.rs#L75-L124
...which will not silently combine two expr types with different extension
types (and will give a reasonable error if somebody tries to do this)
##########
datafusion/sql/src/statement.rs:
##########
@@ -530,11 +531,18 @@ impl<S: ContextProvider> SqlToRel<'_, S> {
.iter()
.zip(input_fields)
.map(|(field, input_field)| {
- cast(
- col(input_field.name()),
- field.data_type().clone(),
- )
- .alias(field.name())
+ let metadata =
+
FieldMetadata::new_from_field(field.as_ref());
+ let alias_metadata = if
metadata.is_empty() {
+ None
+ } else {
+ Some(metadata)
+ };
+ Expr::Cast(Cast::new_from_field(
+ Box::new(col(input_field.name())),
+ Arc::clone(field),
+ ))
+ .alias_with_metadata(field.name(),
alias_metadata)
Review Comment:
I am not sure that blind metadata propagation is an improvement here (a cast
from UUID to String should not propagate metadata, and would result in
confusing errors).
Probably what you want here is to check that the type of the field and the
input field are equal using `check_metadata_with_storage_equal()` and avoid the
cast if this is the case. If not, just create the `Expr::Cast` with the target
field and let a future improvement implement the cast properly.
##########
datafusion/sqllogictest/test_files/sql_extension_types.slt:
##########
@@ -0,0 +1,137 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+
+# http://www.apache.org/licenses/LICENSE-2.0
+
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Tests for built-in mapping of SQL types to Arrow canonical extension types.
+# See: https://arrow.apache.org/docs/format/CanonicalExtensions.html
+
+#######
+# UUID Extension Type
+#######
+
+# CAST to UUID preserves extension metadata
+query ?T
+SELECT CAST(arrow_cast(X'00010203040506070809000102030506',
'FixedSizeBinary(16)') AS UUID),
Review Comment:
While this exact example makes sense, in general this kind of thing should
error (e.g., I think in your implementation you could cast an integer to a UUID
and it would not error and give a completely invalid result...if I'm wrong, it
should at least be tested here).
##########
datafusion/sql/src/planner.rs:
##########
@@ -658,6 +661,21 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
// If no type_planner can handle this type, use the default conversion
match sql_type {
+ // Canonical Arrow extension types
+ SQLDataType::Uuid => Ok(Arc::new(
Review Comment:
Here I think we can use the `try_new()` implementations (which do what
Tobias suggested under the hood):
https://github.com/apache/datafusion/blob/fa9ada36871074b622f3ca67fcdaf34d7a1efdbc/datafusion/common/src/types/canonical_extensions/uuid.rs#L39-L48
We may want to gate this on whether or not these are registered with the
session context (i.e., whether or not `session.extension_types()` can resolve
"arrow.uuid").
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]