kosiew opened a new pull request, #17773:
URL: https://github.com/apache/datafusion/pull/17773
## Which issue does this PR close?
* Closes #17760
## Rationale for this change
The planner sometimes needs to rewrite the physical representation of a
resolved column to a new schema/field while preserving nested-structure
semantics. The existing `CastExpr` doesn't carry the input/target field
metadata required to perform struct-aware casts (correct nested field ordering,
nullability, and null-padding for missing children). Adding a dedicated
`CastColumnExpr` lets the execution layer call into
`datafusion_common::nested_struct::cast_column` and guarantees casts behave
correctly for both array and scalar values.
This change is focused on execution-time casting semantics (schema-aware
casts) and does not attempt to modify planner/optimizer behaviour.
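To make the struct-aware semantics above concrete, here is a minimal, std-only sketch. It is a toy model, not the code in `datafusion_common::nested_struct`: the names `Value`, `Shape`, and `cast_to_shape` are hypothetical, and it assumes children are matched by name, emitted in the target's order, and null-padded when the source lacks them.

```rust
// Toy model of a struct-aware cast. `Value`, `Shape`, and `cast_to_shape`
// are hypothetical names for illustration; this is not DataFusion's code.

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Int(i64),
    Struct(Vec<(String, Value)>),
}

/// Target layout: a leaf, or a struct with ordered, named children.
#[derive(Debug, Clone)]
enum Shape {
    Leaf,
    Struct(Vec<(String, Shape)>),
}

/// Rebuild `value` so that it follows `target`.
/// Assumption: children are matched by name, emitted in the target's order,
/// and a child missing from the source becomes `Value::Null`.
fn cast_to_shape(value: &Value, target: &Shape) -> Result<Value, String> {
    match (value, target) {
        // A null source stays null regardless of the target layout.
        (Value::Null, _) => Ok(Value::Null),
        (Value::Int(i), Shape::Leaf) => Ok(Value::Int(*i)),
        (Value::Struct(fields), Shape::Struct(children)) => {
            let mut out = Vec::with_capacity(children.len());
            for (name, child_shape) in children {
                let casted = match fields.iter().find(|(n, _)| n == name) {
                    // Recurse so nested structs get the same treatment.
                    Some((_, v)) => cast_to_shape(v, child_shape)?,
                    // Null-pad a child the source does not have.
                    None => Value::Null,
                };
                out.push((name.clone(), casted));
            }
            Ok(Value::Struct(out))
        }
        _ => Err(format!("cannot cast {value:?} to {target:?}")),
    }
}
```

In this model, casting a source struct with fields `[a, b]` to a target of `[a, c]` keeps `a`, drops `b`, and fills `c` with `Value::Null`.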
## What changes are included in this PR?
* Add new physical expression implementation:
  `datafusion/physical-expr/src/expressions/cast_column.rs`.
  * Defines `CastColumnExpr` struct which contains:
    * `expr: Arc<dyn PhysicalExpr>`: child expression producing the value
      to be cast.
    * `input_field: FieldRef`: resolved input field metadata.
    * `target_field: FieldRef`: desired output field metadata.
    * `cast_options: CastOptions<'static>`: forwarded to `cast_column`.
  * Implements `PhysicalExpr` for `CastColumnExpr`:
    * `data_type` and `nullable` reflect the `target_field`.
    * `evaluate` handles both `ColumnarValue::Array` and
      `ColumnarValue::Scalar` by delegating to
      `datafusion_common::nested_struct::cast_column`, then converting results
      back to `ColumnarValue`.
    * `children`, `with_new_children`, `fmt_sql`, `return_field` implemented
      for planner/execution compatibility.
  * Manual `PartialEq` and `Hash` impls to accommodate the
    `Arc<dyn PhysicalExpr>` child.
* Export the new expression in
  `datafusion/physical-expr/src/expressions/mod.rs` (`mod cast_column;` and
  `pub use cast_column::CastColumnExpr;`).
* Add unit tests in `cast_column.rs` covering:
  * Primitive array casts (Int32 -> Int64).
  * Struct array where the target struct has a missing child (null-padding
    behavior).
  * Nested struct casts (preserve nested layout, add missing nested children
    with nulls).
  * Struct scalar casts (scalar -> casted struct scalar).
* Add module-level docs/comments describing intent and usage.
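For readers unfamiliar with why a struct holding a trait-object child may need hand-written `PartialEq` and `Hash`, the following std-only sketch shows one common pattern. It is an illustration under stated assumptions, not the PR's code: the `Expr` trait, its `dyn_eq`/`dyn_hash` methods, `ColumnRef`, `CastLike`, and `hash_of` are all hypothetical stand-ins.

```rust
use std::any::Any;
use std::collections::hash_map::DefaultHasher;
use std::fmt::Debug;
use std::hash::{Hash, Hasher};
use std::sync::Arc;

// Hypothetical stand-in for a dynamically typed expression trait.
trait Expr: Debug {
    fn as_any(&self) -> &dyn Any;
    fn dyn_eq(&self, other: &dyn Expr) -> bool;
    fn dyn_hash(&self, state: &mut dyn Hasher);
}

#[derive(Debug, PartialEq, Eq, Hash)]
struct ColumnRef {
    index: usize,
}

impl Expr for ColumnRef {
    fn as_any(&self) -> &dyn Any {
        self
    }
    // Equal only if `other` is the same concrete type with equal fields.
    fn dyn_eq(&self, other: &dyn Expr) -> bool {
        other
            .as_any()
            .downcast_ref::<Self>()
            .map_or(false, |o| self == o)
    }
    // `&mut dyn Hasher` itself implements `Hasher`, so it can be passed on.
    fn dyn_hash(&self, mut state: &mut dyn Hasher) {
        self.hash(&mut state);
    }
}

// Hypothetical cast-like node with a trait-object child.
#[derive(Debug)]
struct CastLike {
    expr: Arc<dyn Expr>,
    target_type: String,
}

impl PartialEq for CastLike {
    fn eq(&self, other: &Self) -> bool {
        self.target_type == other.target_type && self.expr.dyn_eq(other.expr.as_ref())
    }
}

impl Eq for CastLike {}

impl Hash for CastLike {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.target_type.hash(state);
        self.expr.dyn_hash(state);
    }
}

/// Helper for trying the impls out: hash any `Hash` value with the default hasher.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}
```

Two `CastLike` values built from equal children and equal target types compare equal and hash identically, which keeps `Eq` and `Hash` consistent with each other.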
## Are these changes tested?
Yes — unit tests included in `cast_column.rs` exercise array & scalar cases,
nested structs, missing children, null-padding and simple primitive casts.
Tests added:
* `cast_primitive_array` — cast `Int32Array` to `Int64Array`.
* `cast_struct_array_missing_child` — source struct has fields `[a, b]`,
target struct requests `[a, c]` and expects `c` to be all-null.
* `cast_nested_struct_array` — nested struct casting where inner struct adds
a new child field that must be null-padded.
* `cast_struct_scalar` — casting a struct literal (scalar) and preserving
result as a `ScalarValue::Struct` with casted children.
All tests are Rust unit tests using `RecordBatch` and `Column`/`Literal`
helpers already available in the crate.
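The array and scalar cases exercised by these tests can share one cast kernel. The std-only sketch below illustrates that idea with a toy Int32 -> Int64 widening. It is not the PR's `evaluate` implementation: `Columnar`, `cast_array`, and `evaluate_cast` are hypothetical names, and promoting a scalar to a one-element array before casting is an assumption made for illustration.

```rust
// Toy model of array/scalar dispatch. `Columnar`, `cast_array`, and
// `evaluate_cast` are hypothetical names; this is not DataFusion's code.

#[derive(Debug, Clone, PartialEq)]
enum Columnar<T> {
    Array(Vec<Option<T>>),
    Scalar(Option<T>),
}

/// The single array-level kernel: widen each non-null i32 to i64,
/// leaving nulls (`None`) in place.
fn cast_array(input: &[Option<i32>]) -> Vec<Option<i64>> {
    input.iter().map(|v| v.map(i64::from)).collect()
}

/// Cast either representation by reusing the array kernel.
fn evaluate_cast(input: &Columnar<i32>) -> Columnar<i64> {
    match input {
        Columnar::Array(values) => Columnar::Array(cast_array(values)),
        Columnar::Scalar(value) => {
            // Assumption: promote the scalar to a one-element array,
            // cast it, then take the single element back out.
            let casted = cast_array(&[*value]);
            Columnar::Scalar(casted[0])
        }
    }
}
```

Routing both variants through one kernel means the scalar path cannot drift from the array path, which is the property a test such as `cast_struct_scalar` would check in the real implementation.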
## Are there any user-facing changes?
* Public API: `CastColumnExpr` is exported from
`datafusion::physical_expr::expressions` and can be constructed and used by
callers who build physical expression trees. It is primarily intended for
internal planner code that needs to perform schema-aware casts of resolved columns.
* No changes to existing SQL surface or planner rules are included in this
PR — it only adds a building block for use by other components (e.g. schema
rewriters or planner nodes).
* No behaviour change for existing code that doesn't use `CastColumnExpr`.