gruuya opened a new issue, #11865: URL: https://github.com/apache/datafusion/issues/11865
### Is your feature request related to a problem or challenge?

Presently the `information_schema.tables` builder loads all tables serially when constructing its output

https://github.com/apache/datafusion/blob/bddb6415a50746d2803dd908d19c3758952d74f9/datafusion/core/src/catalog_common/information_schema.rs#L93-L102

In our case those are Delta tables, which means that:
- Each load (likely) results in network request(s) to an object store, so loading many tables in series causes a slowdown (see https://github.com/splitgraph/seafowl/issues/589 for an example)
- Since we already have the table name, the only reason the table is loaded is to fetch the table type, which for Delta tables is hard-coded https://github.com/delta-io/delta-rs/blob/aa28d730e1d69ed419f2dc22404c5bbab8e98647/crates/core/src/delta_datafusion/mod.rs#L700

### Describe the solution you'd like

It seems that loading the full `TableProvider` for each table is overkill, since we only ever want to know the table types. In addition, it would be preferable to have a bulk load method for the case where the table type is not hard-coded and must be fetched from an external source.

In principle this could be achieved by having a method on the schema provider that returns `Vec<TableSource>`, since `TableSource` also has the table type.

This is further complicated by the fact that `information_schema.columns` and `information_schema.views` also do this serial table loading, but in their case it's the table schema and the table definition that are fetched. `TableSource` does have the former, but not the latter. Moreover, to get a Delta table's schema you really need to [load](https://github.com/delta-io/delta-rs/blob/aa28d730e1d69ed419f2dc22404c5bbab8e98647/crates/core/src/table/mod.rs#L316) it (unless you also keep track of it someplace else), which brings us back to the initial problem.
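
To make the idea a bit more concrete, here is a rough, std-only sketch of a bulk table-type method with a serial fallback. Everything in it is a hypothetical stand-in and not DataFusion's actual API: the `SchemaProvider` trait here is a simplified synchronous version, `table_types` is the proposed (non-existent) method, and `DeltaSchema`/`NaiveSchema` are illustrative names only. It also assumes the hard-coded Delta table type is `Base`.

```rust
use std::cell::Cell;

/// Simplified stand-in for DataFusion's `TableType`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TableType {
    Base,
    View,
    Temporary,
}

/// Simplified stand-in for a fully loaded `TableProvider`.
pub struct LoadedTable {
    pub table_type: TableType,
}

/// Hypothetical, synchronous stand-in for a schema provider.
pub trait SchemaProvider {
    fn table_names(&self) -> Vec<String>;

    /// Potentially expensive: for Delta tables this would hit the object store.
    fn table(&self, name: &str) -> Option<LoadedTable>;

    /// Hypothetical bulk method. The default mirrors today's behaviour by
    /// loading every table in series; providers may override it.
    fn table_types(&self) -> Vec<(String, TableType)> {
        self.table_names()
            .into_iter()
            .filter_map(|name| {
                let table_type = self.table(&name)?.table_type;
                Some((name, table_type))
            })
            .collect()
    }
}

/// Illustrative provider in which every table is a Delta table.
pub struct DeltaSchema {
    names: Vec<String>,
    loads: Cell<usize>, // counts simulated table loads
}

impl DeltaSchema {
    pub fn new(names: &[&str]) -> Self {
        Self {
            names: names.iter().map(|n| n.to_string()).collect(),
            loads: Cell::new(0),
        }
    }

    pub fn loads(&self) -> usize {
        self.loads.get()
    }
}

impl SchemaProvider for DeltaSchema {
    fn table_names(&self) -> Vec<String> {
        self.names.clone()
    }

    fn table(&self, name: &str) -> Option<LoadedTable> {
        if !self.names.iter().any(|n| n == name) {
            return None;
        }
        self.loads.set(self.loads.get() + 1); // simulate the costly load
        Some(LoadedTable { table_type: TableType::Base })
    }

    /// The table type is hard-coded (assumed `Base`), so nothing is loaded.
    fn table_types(&self) -> Vec<(String, TableType)> {
        self.names.iter().map(|n| (n.clone(), TableType::Base)).collect()
    }
}

/// Same tables, but relying on the serial default of `table_types`.
pub struct NaiveSchema(pub DeltaSchema);

impl SchemaProvider for NaiveSchema {
    fn table_names(&self) -> Vec<String> {
        self.0.table_names()
    }

    fn table(&self, name: &str) -> Option<LoadedTable> {
        self.0.table(name)
    }
}

/// Rows for a simplified `information_schema.tables`: (table name, table type).
pub fn information_schema_tables(schema: &dyn SchemaProvider) -> Vec<(String, TableType)> {
    schema.table_types()
}
```

With the default implementation each table is loaded once, whereas the override never calls `table` at all. This only covers the table type; `columns` and `views` would still need the schema and the definition.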
### Describe alternatives you've considered

If I know that all the tables are Delta tables, I could write a custom `information_schema.tables` builder that just returns the hard-coded table type, though this doesn't help with `columns` and `views`.

### Additional context

_No response_
