alamb commented on issue #7845: URL: https://github.com/apache/datafusion/issues/7845#issuecomment-2368900548
> somehow get filter pushdown / late materialization to work based on the result of a UDF so some columns aren't decompressed (or even aren't fetched) unless they're needed

This seems like the right idea to pursue to me

The `ParquetExec` already can push filtering down into the scan via

```
datafusion.execution.parquet.pushdown_filters true
```

(we have it on in InfluxData)

But it is not turned on by default as it isn't faster for all cases yet: https://github.com/apache/datafusion/pull/12524
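To make the late-materialization idea concrete, here is a toy sketch in Python. It is not how `ParquetExec` is implemented; `ToyRowGroup` and `scan_with_pushdown` are hypothetical names invented for illustration, and zlib-compressed JSON stands in for Parquet column chunks. It only shows the ordering: decode the filter column first, and decompress the remaining columns only if some row passes.

```python
import json
import zlib


class ToyRowGroup:
    """Hypothetical stand-in for a Parquet row group: one compressed blob per column."""

    def __init__(self, columns):
        self._blobs = {
            name: zlib.compress(json.dumps(values).encode("utf-8"))
            for name, values in columns.items()
        }
        self.decoded = []  # names of columns that were actually decompressed

    def decode(self, name):
        self.decoded.append(name)
        return json.loads(zlib.decompress(self._blobs[name]).decode("utf-8"))


def scan_with_pushdown(row_group, filter_column, predicate, projection):
    """Evaluate the predicate on one column, then decode the other columns lazily."""
    filter_values = row_group.decode(filter_column)
    mask = [bool(predicate(v)) for v in filter_values]
    if not any(mask):
        # No row survives the filter: the projected columns are never decompressed.
        return []

    decoded = {filter_column: filter_values}
    for name in projection:
        if name not in decoded:
            decoded[name] = row_group.decode(name)

    # Materialize only the rows selected by the mask.
    return [
        {name: decoded[name][i] for name in projection}
        for i, keep in enumerate(mask)
        if keep
    ]


if __name__ == "__main__":
    rg = ToyRowGroup({"host": ["a", "b", "c"], "payload": ["x", "y", "z"]})
    print(scan_with_pushdown(rg, "host", lambda h: h == "b", ["payload"]))
    print(rg.decoded)
```

In this toy model the saving appears only when the predicate is selective enough to skip whole column decodes, and the predicate is evaluated up front regardless, which loosely mirrors why pushdown is not a win for every query.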
