martin-g commented on code in PR #18971:
URL: https://github.com/apache/datafusion/pull/18971#discussion_r2570027135
##########
datafusion/core/src/datasource/listing_table_factory.rs:
##########
@@ -190,6 +190,12 @@ impl TableProviderFactory for ListingTableFactory {
.with_definition(cmd.definition.clone())
.with_constraints(cmd.constraints.clone())
.with_column_defaults(cmd.column_defaults.clone());
+
+ // Pre-warm statistics cache if collect_statistics is enabled
+ if session_state.config().collect_statistics() {
+ let _ = table.list_files_for_scan(state, &[], None).await?;
Review Comment:
Should errors in the pre-warming be propagated?
Maybe handle or ignore failures locally instead?
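
A minimal sketch of what handling the failure locally could look like. It uses hypothetical stand-in functions (`prewarm_statistics`, `create_table`) rather than the real `ListingTable` API, and assumes that pre-warming is a best-effort optimization whose failure should not abort table creation:

```rust
// Hypothetical stand-in for the pre-warming call; not DataFusion's API.
// Returns the number of files whose statistics were cached.
fn prewarm_statistics(fail: bool) -> Result<usize, String> {
    if fail {
        Err("object store unavailable".to_string())
    } else {
        Ok(3)
    }
}

// Hypothetical table-creation path: a failed pre-warm is reported
// and otherwise ignored, instead of being propagated with `?`.
fn create_table(fail_prewarm: bool) -> Result<&'static str, String> {
    if let Err(e) = prewarm_statistics(fail_prewarm) {
        eprintln!("statistics pre-warm failed, continuing: {e}");
    }
    Ok("table")
}
```

With this shape, `create_table(true)` still returns `Ok("table")`; the statistics would simply be collected later, on first use.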
##########
datafusion/core/src/datasource/listing_table_factory.rs:
##########
@@ -190,6 +190,12 @@ impl TableProviderFactory for ListingTableFactory {
.with_definition(cmd.definition.clone())
.with_constraints(cmd.constraints.clone())
.with_column_defaults(cmd.column_defaults.clone());
+
+ // Pre-warm statistics cache if collect_statistics is enabled
+ if session_state.config().collect_statistics() {
+ let _ = table.list_files_for_scan(state, &[], None).await?;
Review Comment:
> 2. Also is no limit fine?
I think it should have a limit.
It could also be done in the background.
If there are many files, pre-warming may slow things down.
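
A rough sketch of both ideas together, a cap on the number of files and running the work off the calling path. The names `collect_stats` and `prewarm_in_background` are hypothetical, and a plain OS thread is used only to keep the example self-contained; in the real async code path a runtime task would presumably be the analogue:

```rust
use std::thread::{self, JoinHandle};

// Hypothetical stand-in for collecting statistics for a single file.
fn collect_stats(path: &str) -> usize {
    path.len()
}

// Spawns a background thread that warms at most `limit` files and
// returns a handle yielding how many files were actually processed.
fn prewarm_in_background(files: Vec<String>, limit: usize) -> JoinHandle<usize> {
    thread::spawn(move || {
        let mut warmed = 0;
        for file in files.iter().take(limit) {
            let _ = collect_stats(file);
            warmed += 1;
        }
        warmed
    })
}
```

The caller can drop the handle to let the warm-up run unattended, or join it in tests to check that the cap is respected.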
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]