friendlymatthew commented on code in PR #20854:
URL: https://github.com/apache/datafusion/pull/20854#discussion_r2913765088
##########
datafusion/datasource-parquet/src/row_filter.rs:
##########
@@ -449,7 +466,17 @@ impl TreeNodeVisitor<'_> for PushdownChecker<'_> {
{
let return_type = func.return_type();
if !DataType::is_nested(return_type) {
- if let Some(recursion) = self.check_struct_field_column(column.name())
+ let field_path = args[1..]
+ .iter()
+ .filter_map(|arg| {
+ arg.as_any().downcast_ref::<Literal>().and_then(|lit| {
+ lit.value().try_as_str().flatten().map(|s| s.to_string())
+ })
+ })
+ .collect();
Review Comment:
This `filter_map` silently drops any argument that is not a string literal,
so an unexpected argument would shorten `field_path` rather than raise an
error. In practice this is safe: `GetFieldFunc` always takes string-literal
field names, and the simplifier (`GetFieldFunc::simplify`) flattens chained
`get_field` calls into a single call before we reach physical planning.
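
To make the trade-off concrete, here is a minimal sketch that contrasts the lenient `filter_map` shape with a stricter variant that bails out on the first unexpected argument. It uses only the standard library. `Arg`, `as_str_literal`, `field_path_lenient` and `field_path_strict` are hypothetical names invented for illustration and are not DataFusion APIs.

```rust
/// Hypothetical stand-in for a physical expression argument.
/// This is NOT a DataFusion type; it only models "string literal or not".
#[allow(dead_code)]
#[derive(Debug)]
enum Arg {
    Column(String),
    StrLiteral(String),
    IntLiteral(i64),
}

/// Returns the string payload if the argument is a string literal.
fn as_str_literal(arg: &Arg) -> Option<&str> {
    match arg {
        Arg::StrLiteral(s) => Some(s.as_str()),
        _ => None,
    }
}

/// Lenient: same shape as the `filter_map` in the diff. Arguments after the
/// first that are not string literals are dropped without any signal.
fn field_path_lenient(args: &[Arg]) -> Vec<String> {
    args.iter()
        .skip(1) // skip the base column, like `args[1..]` but safe on empty input
        .filter_map(|a| as_str_literal(a).map(str::to_string))
        .collect()
}

/// Strict: collecting into `Option<Vec<_>>` yields `None` as soon as one
/// argument after the first is not a string literal.
fn field_path_strict(args: &[Arg]) -> Option<Vec<String>> {
    args.iter()
        .skip(1)
        .map(|a| as_str_literal(a).map(str::to_string))
        .collect()
}
```

With arguments modelled as `[Column("s"), StrLiteral("a"), IntLiteral(0), StrLiteral("b")]`, the lenient version yields the path `["a", "b"]` while the strict version yields `None`. Whether the strict form is worth it here depends on how firmly the string-literal invariant is enforced upstream.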
##########
datafusion/datasource-parquet/src/row_filter.rs:
##########
@@ -330,16 +345,17 @@ impl<'schema> PushdownChecker<'schema> {
fn check_struct_field_column(
&mut self,
column_name: &str,
+ field_path: Vec<String>,
) -> Option<TreeNodeRecursion> {
- let idx = match self.file_schema.index_of(column_name) {
- Ok(idx) => idx,
- Err(_) => {
- self.projected_columns = true;
- return Some(TreeNodeRecursion::Jump);
- }
+ let Ok(idx) = self.file_schema.index_of(column_name) else {
+ self.projected_columns = true;
+ return Some(TreeNodeRecursion::Jump);
};
- self.required_columns.push(idx);
+ self.struct_field_accesses.push(StructFieldAccess {
Review Comment:
Struct field accesses are tracked in a separate vec rather than being pushed
into `required_columns`.
This is intentional: `required_columns` feeds into `leaf_indices_for_roots`,
which expands a root index to all of its leaves. By keeping the struct
accesses separate, we can resolve them to only the specific leaves needed via
`resolve_struct_field_leaves`.
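
As a rough illustration of the difference between expanding a root to all of its leaves and resolving a field path to specific leaves, here is a standard-library-only sketch. It assumes leaves are numbered depth-first across the schema. `Node`, `leaf_count`, `first_leaf_of_root`, `leaves_for_root` and `leaves_for_path` are hypothetical names and do not reproduce the actual `leaf_indices_for_roots` or `resolve_struct_field_leaves` implementations.

```rust
/// Hypothetical schema node; not an Arrow or Parquet type.
enum Node {
    Leaf,
    Struct(Vec<(&'static str, Node)>),
}

/// Number of leaf columns under a node.
fn leaf_count(node: &Node) -> usize {
    match node {
        Node::Leaf => 1,
        Node::Struct(children) => children.iter().map(|(_, c)| leaf_count(c)).sum(),
    }
}

/// Index of the first leaf belonging to root column `root`,
/// assuming depth-first leaf numbering. Panics if `root` exceeds the root count.
fn first_leaf_of_root(roots: &[(&str, Node)], root: usize) -> usize {
    roots[..root].iter().map(|(_, n)| leaf_count(n)).sum()
}

/// Coarse: every leaf under the root column. Panics if `root` is out of range.
fn leaves_for_root(roots: &[(&str, Node)], root: usize) -> Vec<usize> {
    let start = first_leaf_of_root(roots, root);
    (start..start + leaf_count(&roots[root].1)).collect()
}

/// Fine-grained: only the leaves reachable through `path` inside the root.
/// Returns `None` if the root is out of range, a path segment is missing,
/// or the path descends into a non-struct node.
fn leaves_for_path(roots: &[(&str, Node)], root: usize, path: &[&str]) -> Option<Vec<usize>> {
    let mut node = &roots.get(root)?.1;
    let mut start = first_leaf_of_root(roots, root);
    for name in path {
        let Node::Struct(children) = node else {
            return None;
        };
        let pos = children.iter().position(|(n, _)| n == name)?;
        // Skip the leaves of every sibling that precedes the matched child.
        start += children[..pos].iter().map(|(_, c)| leaf_count(c)).sum::<usize>();
        node = &children[pos].1;
    }
    Some((start..start + leaf_count(node)).collect())
}
```

For a schema with roots `id` (a leaf) and `s { a, b { x, y }, c }`, expanding root `s` selects leaves 1 through 4, while resolving the path `["b", "y"]` inside `s` selects only leaf 3. That gap is the extra data a filter on `s.b.y` would have to decode if the access were recorded as a plain required column.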
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]