adriangb commented on issue #14993: URL: https://github.com/apache/datafusion/issues/14993#issuecomment-3049060554
> I was thinking about this -- it isn't clear to me how to represent a pushdown of a subfield of a struct in a general way other than an expression like `extract(struct_col, "field_path")`
>
> The leaf column stuff is a parquet type system specific thing that doesn't map directly to Arrow data types, etc

Yes, I've thought about that as well. I think the only sane way to do it is to pass `extract(struct_col, "field_path")` as part of the projection into (eventually, after a couple of hoops) `ParquetOpener`. Then `ParquetOpener` has to decide, for each `Arc<dyn PhysicalExpr>`, how to build the right `ProjectionMask` plus a rewritten `Arc<dyn PhysicalExpr>` to evaluate.

For example, if it gets `[extract(struct_col, "field_path")]` as the projections, it might rewrite that to `Column(0)` and use the projection mask `ProjectionMask::leafs(1)` (assuming 1 is the right leaf for the field `"field_path"`).
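To make the rewrite described above concrete, here is a rough, std-only sketch of the decision `ParquetOpener` would have to make. None of these types are DataFusion's or parquet's: `Expr`, `Leaf`, `Plan`, and `plan_projection` are hypothetical stand-ins, and the "mask" is just a sorted list of leaf indices rather than a real `ProjectionMask` (as far as I know, the parquet crate builds one with `ProjectionMask::leaves(schema_descr, indices)`). It also assumes each selected leaf comes back as its own output column, which is a simplification of how a struct with a pruned set of children would actually be decoded.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a physical expression (not DataFusion's `PhysicalExpr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// Reference to a column by its position in the decoded batch.
    Column(usize),
    /// Models `extract(struct_col, "field_path")`.
    Extract { struct_col: String, field_path: String },
}

/// Hypothetical description of one Parquet leaf column.
/// A leaf's index is its position in the slice passed to `plan_projection`.
struct Leaf {
    root: String, // top-level column name
    path: String, // dotted path inside the root ("" for a primitive root)
}

/// What the opener would need: which leaves to read, and what to evaluate on them.
#[derive(Debug, PartialEq)]
struct Plan {
    leaf_mask: Vec<usize>, // sorted, de-duplicated leaf indices
    exprs: Vec<Expr>,      // projections rewritten against the masked batch
}

/// Resolve each `Extract` to a leaf index, then rewrite it to a `Column`
/// pointing at that leaf's position within the mask.
/// Returns `None` if a field cannot be resolved or the expression is unsupported.
fn plan_projection(leaves: &[Leaf], projections: &[Expr]) -> Option<Plan> {
    let mut resolved = Vec::with_capacity(projections.len());
    for expr in projections {
        match expr {
            Expr::Extract { struct_col, field_path } => {
                let idx = leaves
                    .iter()
                    .position(|l| &l.root == struct_col && &l.path == field_path)?;
                resolved.push(idx);
            }
            // This sketch only handles subfield extraction.
            Expr::Column(_) => return None,
        }
    }

    // The mask is the set of distinct leaves, in file order.
    let leaf_mask: Vec<usize> = resolved
        .iter()
        .copied()
        .collect::<BTreeSet<_>>()
        .into_iter()
        .collect();

    // Each projection becomes a column reference into the masked output.
    let exprs = resolved
        .iter()
        .map(|idx| Expr::Column(leaf_mask.iter().position(|m| m == idx).unwrap()))
        .collect();

    Some(Plan { leaf_mask, exprs })
}
```

With leaves `id`, `struct_col.field_path`, and `struct_col.other` (indices 0, 1, 2), planning `[extract(struct_col, "field_path")]` gives a mask containing only leaf 1 and the rewritten expression `Column(0)`, which matches the example in the comment.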