alamb commented on issue #18909: URL: https://github.com/apache/datafusion/issues/18909#issuecomment-3577837262
So the stack trace above shows that the "FilesStatisticsCache" is being consulted, and I suspect it is being used in subsequent runs:

https://github.com/apache/datafusion/blob/3c21b546a9acf9922229220d3ceca91a945cbf46/datafusion/catalog-listing/src/table.rs#L181-L180

However, the cache doesn't appear to be populated as part of CREATE TABLE for some reason.

I looked into the code and it does appear that the statistics cache is attached directly to the ListingTable and instantiated (but not populated) when the table is created the first time:

https://github.com/apache/datafusion/blob/3c21b546a9acf9922229220d3ceca91a945cbf46/datafusion/catalog-listing/src/table.rs#L227-L226

My next steps will be:
1. File a ticket that explains this in more detail
2. Try to wire up the statistics cache correctly so that DataFusion behaves the way the comments say it should
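
To make the observation concrete, here is a toy sketch of the pattern described above: a cache that is created together with the table but only filled on the first lookup. It is not DataFusion's code. `ToyListingTable`, `StatsCache`, `FileStats`, and the row-count placeholder are all hypothetical names invented for illustration, and the real `ListingTable` may differ in every detail.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for per-file statistics.
#[derive(Debug, Clone, PartialEq)]
struct FileStats {
    num_rows: usize,
}

/// Hypothetical stand-in for a statistics cache keyed by file path.
#[derive(Debug, Default)]
struct StatsCache {
    entries: HashMap<String, FileStats>,
}

/// Hypothetical stand-in for a listing table that owns its cache.
struct ToyListingTable {
    files: Vec<String>,
    stats_cache: StatsCache,
    /// Counts how often statistics had to be computed on a cache miss.
    stats_computed: usize,
}

impl ToyListingTable {
    /// The cache is instantiated with the table but left empty.
    fn create(files: Vec<String>) -> Self {
        Self {
            files,
            stats_cache: StatsCache::default(),
            stats_computed: 0,
        }
    }

    /// Consult the cache first; compute and store the entry on a miss.
    fn stats_for(&mut self, path: &str) -> FileStats {
        if let Some(stats) = self.stats_cache.entries.get(path) {
            return stats.clone();
        }
        self.stats_computed += 1;
        // Placeholder value: a real system would read file metadata here.
        let stats = FileStats { num_rows: path.len() };
        self.stats_cache
            .entries
            .insert(path.to_string(), stats.clone());
        stats
    }

    /// A "query": looks up statistics for every file in the table.
    fn scan(&mut self) -> usize {
        let files = self.files.clone();
        files.iter().map(|f| self.stats_for(f).num_rows).sum()
    }
}

fn main() {
    let mut table =
        ToyListingTable::create(vec!["a.parquet".into(), "b.parquet".into()]);
    println!("entries after create: {}", table.stats_cache.entries.len());
    table.scan();
    println!("entries after first scan: {}", table.stats_cache.entries.len());
    table.scan();
    println!("statistics computed in total: {}", table.stats_computed);
}
```

In this sketch the first scan pays the cost of computing statistics and later scans hit the cache. Populating the cache inside `create` instead would be the toy equivalent of wiring it up at CREATE TABLE time.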
