adriangb commented on issue #14993:
URL: https://github.com/apache/datafusion/issues/14993#issuecomment-3033503866

   I started looking into this and where it gets messy is:
   1. Partition columns. I think this needs a rethink. I suggest pushing 
partition column generation down into the actual scan of the data using 
projection pushdown, then everything above that doesn't need to special case 
them. But that might mean losing this nifty optimization: 
https://github.com/apache/datafusion/blob/5a0ddbf00ed1336079444cb9217ab2069b6780fc/datafusion/datasource/src/file_scan_config.rs#L1140-L1143
   2. Computing all of the equivalence properties, etc. The good news is that I 
think this will all come out simpler: we should essentially re-use what 
`ProjectionExec` does instead of having a different path for when the 
projection is a `Vec<usize>`.
   3. This is going to need a good number of breaking changes for folks 
using `FileScanConfig` & co directly.
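
   To make points 1 and 2 concrete, here is a rough sketch of the idea, not 
actual DataFusion code. `ProjExpr`, `indices_to_exprs` and `project` are 
hypothetical names, and plain string vectors stand in for Arrow arrays. It 
assumes the usual convention that projection indices past the end of the file 
schema refer to partition columns. Once the `Vec<usize>` is lowered into 
expressions, a partition column is just a literal evaluated in the scan, and 
nothing above the scan needs to special case it.

```rust
// Hypothetical, simplified model. None of these types are DataFusion APIs.

/// One column of string values, standing in for an Arrow array.
type Column = Vec<String>;

/// A projection expression evaluated inside the scan.
#[derive(Debug, Clone, PartialEq)]
enum ProjExpr {
    /// Read the column at this index from the file.
    FileColumn(usize),
    /// A value that is constant for the whole file, such as a partition
    /// value parsed from a path segment like `year=2024/`.
    Literal(String),
}

/// Lower an index-only projection (the `Vec<usize>` style) into expressions.
/// Assumption: indices >= `num_file_columns` refer to partition columns,
/// which become literals for the file being scanned.
fn indices_to_exprs(
    indices: &[usize],
    num_file_columns: usize,
    partition_values: &[String],
) -> Vec<ProjExpr> {
    indices
        .iter()
        .map(|&i| {
            if i < num_file_columns {
                ProjExpr::FileColumn(i)
            } else {
                ProjExpr::Literal(partition_values[i - num_file_columns].clone())
            }
        })
        .collect()
}

/// Evaluate the projection against one batch of file columns.
/// Literals are expanded to `num_rows` copies of the value.
fn project(file_columns: &[Column], num_rows: usize, exprs: &[ProjExpr]) -> Vec<Column> {
    exprs
        .iter()
        .map(|e| match e {
            ProjExpr::FileColumn(i) => file_columns[*i].clone(),
            ProjExpr::Literal(v) => vec![v.clone(); num_rows],
        })
        .collect()
}

fn main() {
    // Two columns read from the file, two rows each.
    let file_columns = vec![
        vec!["a".to_string(), "b".to_string()],
        vec!["1".to_string(), "2".to_string()],
    ];
    // One partition column, constant for this file.
    let partition_values = vec!["2024".to_string()];

    // Old-style projection: file column 1, then the partition column (index 2).
    let exprs = indices_to_exprs(&[1, 2], file_columns.len(), &partition_values);
    let out = project(&file_columns, 2, &exprs);
    println!("{:?}", out);
}
```

   In this toy model the single `project` path handles both cases, which is 
the simplification point 2 is hoping for. The sketch expands each literal to 
a full column, so it does not attempt to preserve the optimization linked in 
point 1.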


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

