adriangb commented on PR #20071: URL: https://github.com/apache/datafusion/pull/20071#issuecomment-3825107178
> > I think it should be much simpler than this: create a UDF and pass it through `ProjectionExprs::transform_exprs(|expr| expr.transform(|expr| // if expr is our ScalarUDF, replace with literal filename))` > > Thanks @adriangb - really appreciate the feedback! > > I've been working on implementing your approach where `input_file_name()` stays as a 0-arg scalar UDF, then `ProjectionOpener::open(partitioned_file)` does `expr.transform(...)` to replace the `ScalarFunctionExpr` with `Literal(Utf8(partitioned_file.location))`. > > Does this direction look right to you? Yes that's what I'm thinking. Both `ProjectionOpener` and `FileSource` implementations that don't use it (`ParquetOpener`) should do the rewrite. > One question on plumbing: in many plans (e.g. `SELECT input_file_name() FROM t WHERE a > 1`), a top `ProjectionExec` can't push through `FilterExec` without dropping columns needed by the filter. This causes the UDF to stay above the scan and hit the runtime error guard. > > For this first PR, which is better? > > 1. Minimal opener rewrite only: works only when projection naturally pushes into the scan > > 2. Include analyzer/optimizer/planner glue: marks scan as needing the reserved column, injects UDF into file source projection so it works with filters/sorts/limits This is precisely what I was referring to in https://github.com/apache/datafusion/issues/6051#issuecomment-3792652539. analyzer/optimizer/planner to sort this out is very complex and error prone. I think let's not do that for now. You should make the default behavior null instead of an error. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
