adriangb commented on issue #20135:
URL: https://github.com/apache/datafusion/issues/20135#issuecomment-4489764202

   IIUC the plan we end up with is:
   
   ```text
   FilterExec: projection=[other_col], filter=[file_row_index() > 3]
     DataSourceExec: projection=[other_col]
   ```
   
   This is because, by design, this UDF can only be evaluated by `ParquetOpener`.
   
   That makes sense and is unfortunate.
   
   Incidentally, the (otherwise unrelated) changes in 
https://github.com/apache/datafusion/pull/22144 / 
https://github.com/apache/datafusion/pull/22237 happen to fix this: they allow 
*any* filter to be pushed down into `ParquetOpener`, because the filter does 
not have to be evaluated as a row filter.
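
   As a rough illustration of the difference, here is a toy model in plain 
Rust. It is not DataFusion code: `FileRow`, `ProjectedRow`, `open_file`, 
`scan_with_pushdown` and `scan_without_pushdown` are all hypothetical names 
made up for this sketch. The only point it tries to show is that a predicate 
on the file row index can be evaluated where the index is known (inside the 
opener) but not by a filter sitting above a scan that projects only 
`other_col`.

```rust
// Toy model only: none of these types or functions are DataFusion APIs.

/// A row as it exists inside the (hypothetical) opener. The position of the
/// row within the file is known here and nowhere else.
#[derive(Debug, Clone, PartialEq)]
struct FileRow {
    file_row_index: u64,
    other_col: i64,
}

/// A row after the scan has projected to `[other_col]`: the index is gone.
#[derive(Debug, Clone, PartialEq)]
struct ProjectedRow {
    other_col: i64,
}

/// Stand-in for the opener: assigns row indices while decoding the file.
fn open_file(values: &[i64]) -> Vec<FileRow> {
    values
        .iter()
        .enumerate()
        .map(|(i, &v)| FileRow {
            file_row_index: i as u64,
            other_col: v,
        })
        .collect()
}

/// Scan with the predicate pushed into the opener. The predicate runs before
/// the projection, while `file_row_index` still exists.
fn scan_with_pushdown(
    values: &[i64],
    predicate: impl Fn(&FileRow) -> bool,
) -> Vec<ProjectedRow> {
    open_file(values)
        .into_iter()
        .filter(|row| predicate(row))
        .map(|row| ProjectedRow {
            other_col: row.other_col,
        })
        .collect()
}

/// Scan without pushdown. Only projected rows leave the scan, so a filter
/// placed above it has nothing to compute a row-index predicate from.
fn scan_without_pushdown(values: &[i64]) -> Vec<ProjectedRow> {
    open_file(values)
        .into_iter()
        .map(|row| ProjectedRow {
            other_col: row.other_col,
        })
        .collect()
}
```

   In this sketch, calling `scan_with_pushdown(&values, |row| 
row.file_row_index > 3)` can apply the predicate, whereas the output of 
`scan_without_pushdown` carries no index for a later filter to use.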
   
   Maybe this is a fundamental limitation of the UDF approach? The other 
approach we explored (adding some sort of system column) requires invasive 
modifications to the concept of a schema itself, if I remember correctly. 
That was the main con, but I'm open to re-exploring it.
   
   Either way, neither of these seems to be a blocker for 
https://github.com/apache/datafusion/pull/22026, and I still plan on merging 
that once 54 is released, @mbutrovich.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

