linhr opened a new issue, #20547: URL: https://github.com/apache/datafusion/issues/20547
### Is your feature request related to a problem or challenge?

Right now `DefaultPhysicalPlanner::map_logical_node_to_physical()` calls `source_as_provider()` and returns an error if the `TableSource` inside `LogicalPlan::TableScan` is not a `DefaultTableSource` (which wraps a `TableProvider`).

`TableSource` was introduced so that logical planning doesn't depend on `TableProvider`. `TableProvider` has a broader set of responsibilities that span both logical planning and physical execution (`TableProvider::scan()`). However, I feel `TableSource` is a valuable abstraction for logical planning in its own right, so it would be good if the user could customize the physical planning for it. In some use cases, the user may want to implement `TableSource` purely as a logical representation of a data source, without coupling the scanning logic into the same struct.

If we allow custom `TableSource`s in `LogicalPlan::TableScan`, custom data sources can benefit from the logical optimizations that apply to table scans: filter pushdown, projection pruning, and fetch limit pushdown.

### Describe the solution you'd like

A trait method `ExtensionPlanner::plan_table_scan()` would be helpful. The user can inject physical planning logic for a `LogicalPlan::TableScan` containing a custom `TableSource` implementation. If none of the registered extension planners returns a physical plan, we fall back to the existing logic, which assumes the `TableSource` wraps a `TableProvider`, and continue the planning from there.

### Describe alternatives you've considered

It is possible to work around this problem in the current setup. The idea is to first convert `LogicalPlan::TableScan` to `LogicalPlan::Extension` by traversing the logical plan tree, and then implement an `ExtensionPlanner` that converts the logical extension node to a physical plan node. This requires more boilerplate code: a `UserDefinedLogicalNode` that serves as the "bridge", plus a logical plan rewriter.
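To make the proposed dispatch-then-fallback behavior concrete, here is a minimal, self-contained sketch. It is a toy model only: the `TableSource` and `ExtensionPlanner` traits below are small hypothetical stand-ins, not DataFusion's actual traits, and the signature of `plan_table_scan()`, the `PhysicalPlan` alias, and the names `DefaultSource`, `CatalogOnlySource`, `CatalogPlanner`, and `plan_scan` are all assumptions made for illustration.

```rust
use std::any::Any;
use std::sync::Arc;

/// Hypothetical stand-in for a logical table source
/// (far smaller than DataFusion's real `TableSource`).
trait TableSource {
    fn name(&self) -> &str;
    fn as_any(&self) -> &dyn Any;
}

/// Plays the role of a source that wraps a provider (like `DefaultTableSource`).
struct DefaultSource {
    name: String,
}

/// A purely logical source that carries no scanning logic of its own.
struct CatalogOnlySource {
    name: String,
}

impl TableSource for DefaultSource {
    fn name(&self) -> &str {
        &self.name
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl TableSource for CatalogOnlySource {
    fn name(&self) -> &str {
        &self.name
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Stand-in for a physical plan: just a description string.
type PhysicalPlan = String;

/// Hypothetical extension hook. `Ok(None)` means "not my source, ask the next planner".
trait ExtensionPlanner {
    fn plan_table_scan(&self, source: &dyn TableSource) -> Result<Option<PhysicalPlan>, String>;
}

/// A user-supplied planner that only understands `CatalogOnlySource`.
struct CatalogPlanner;

impl ExtensionPlanner for CatalogPlanner {
    fn plan_table_scan(&self, source: &dyn TableSource) -> Result<Option<PhysicalPlan>, String> {
        Ok(source
            .as_any()
            .downcast_ref::<CatalogOnlySource>()
            .map(|s| format!("CatalogScanExec({})", s.name)))
    }
}

/// Try each extension planner in order; if none claims the scan, fall back to
/// the existing behavior, which requires a provider-backed source.
fn plan_scan(
    planners: &[Arc<dyn ExtensionPlanner>],
    source: &dyn TableSource,
) -> Result<PhysicalPlan, String> {
    for planner in planners {
        if let Some(plan) = planner.plan_table_scan(source)? {
            return Ok(plan);
        }
    }
    match source.as_any().downcast_ref::<DefaultSource>() {
        Some(s) => Ok(format!("ProviderScanExec({})", s.name)),
        None => Err(format!(
            "table source '{}' is not backed by a provider",
            source.name()
        )),
    }
}
```

In this model, a `CatalogOnlySource` is planned by `CatalogPlanner`, a `DefaultSource` falls through to the provider-based path, and a custom source with no registered planner still produces the same kind of error as today. Returning `Ok(None)` to decline is one possible design choice for chaining several planners; the actual API shape would be decided in the implementation.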

### Additional context

`source_as_provider()` is also used for logical plan Protobuf codec. For consistency, it might be good to support `TableSource`s that are not `TableProvider`s in logical plan Protobuf codec as well, but that would require some breaking changes in `LogicalExtensionCodec`, as well as adding the FFI support for `TableSource`. This is out of scope for this issue.
