BlakeOrth commented on issue #17211: URL: https://github.com/apache/datafusion/issues/17211#issuecomment-3259806289
@alamb Yes, that is the case when the `list_all_files` method is called: https://github.com/apache/datafusion/blob/main/datafusion/datasource/src/url.rs#L237

Unfortunately, a table scan only calls this method conditionally, which means the underlying table structure effectively determines whether the existing cache infrastructure can be used at all. My proposed solution is to use `list_all_files` in all cases, so the existing cache infrastructure can be leveraged regardless of the underlying table structure.

Here's the code that shows where the logic diverges between `list_all_files` and the other methods that are called: https://github.com/apache/datafusion/blob/f3941b207eeaa7768d840e17c32fa61f3b6fca71/datafusion/catalog-listing/src/helpers.rs#L417-L438
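
To make the difference concrete, here is a rough, self-contained toy model of the two behaviours. It is not DataFusion code: `CountingStore`, `ListCache`, `scan_conditional`, `scan_always_cached`, and the `structure_allows_cache` flag are all hypothetical names made up for illustration, and the toy `list_all_files` only borrows the real method's name. The flag stands in for whatever condition the linked helper actually branches on.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Toy stand-in for an object store that counts how often it is listed.
struct CountingStore {
    files: Vec<String>,
    list_calls: Cell<usize>,
}

impl CountingStore {
    fn new(files: &[&str]) -> Self {
        Self {
            files: files.iter().map(|f| f.to_string()).collect(),
            list_calls: Cell::new(0),
        }
    }

    /// Simulates a (potentially expensive) remote LIST request.
    fn list(&self, prefix: &str) -> Vec<String> {
        self.list_calls.set(self.list_calls.get() + 1);
        self.files
            .iter()
            .filter(|f| f.starts_with(prefix))
            .cloned()
            .collect()
    }
}

/// Hypothetical listing cache keyed by prefix.
#[derive(Default)]
struct ListCache {
    entries: HashMap<String, Vec<String>>,
}

/// Cache-aware path; plays the role of `list_all_files` in this sketch.
fn list_all_files(store: &CountingStore, cache: &mut ListCache, prefix: &str) -> Vec<String> {
    if let Some(hit) = cache.entries.get(prefix) {
        return hit.clone();
    }
    let listed = store.list(prefix);
    cache.entries.insert(prefix.to_string(), listed.clone());
    listed
}

/// Behaviour described in the comment: the table structure decides which
/// path runs, so the cache is only consulted on one branch.
fn scan_conditional(
    store: &CountingStore,
    cache: &mut ListCache,
    prefix: &str,
    structure_allows_cache: bool, // hypothetical stand-in for the real condition
) -> Vec<String> {
    if structure_allows_cache {
        list_all_files(store, cache, prefix)
    } else {
        // Other listing path: goes straight to the store, bypassing the cache.
        store.list(prefix)
    }
}

/// Proposed behaviour: every scan goes through the cache-aware path.
fn scan_always_cached(store: &CountingStore, cache: &mut ListCache, prefix: &str) -> Vec<String> {
    list_all_files(store, cache, prefix)
}
```

In this model, two scans of the same prefix through `scan_conditional` with `structure_allows_cache = false` issue two LIST requests, while two scans through `scan_always_cached` issue only one.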