matthewgapp opened a new issue, #10889:
URL: https://github.com/apache/datafusion/issues/10889
### Is your feature request related to a problem or challenge?
Accessing a `TableProvider`'s schema is a sync function call. This means that
the `TableProvider` must already know its schema by the time it is constructed.
DataFusion recently introduced `TableFunctionImpl`, which allows users to
define a function that creates a `TableProvider`. Unfortunately, its `call`
method is sync, so a user-defined table function must be able to produce its
schema upfront, without awaiting any I/O. This isn't possible for
`TableProvider`s that infer their schema asynchronously, such as an HTTP
connector that connects to arbitrary sources whose payloads are only known
once the response is streaming in.
### Describe the solution you'd like
I propose we make the `call` method async to allow for async schemas and
thus async table provider construction.
Current code:
```rust
use super::TableProvider;
use datafusion_common::Result;
use datafusion_expr::Expr;
use std::sync::Arc;

/// A trait for table function implementations
pub trait TableFunctionImpl: Sync + Send {
    /// Create a table provider
    fn call(&self, args: &[Expr]) -> Result<Arc<dyn TableProvider>>;
}

/// A table that uses a function to generate data
pub struct TableFunction {
    /// Name of the table function
    name: String,
    /// Function implementation
    fun: Arc<dyn TableFunctionImpl>,
}

impl TableFunction {
    /// Create a new table function
    pub fn new(name: String, fun: Arc<dyn TableFunctionImpl>) -> Self {
        Self { name, fun }
    }

    /// Get the name of the table function
    pub fn name(&self) -> &str {
        &self.name
    }

    /// Get the function implementation and generate a table
    pub fn create_table_provider(&self, args: &[Expr]) -> Result<Arc<dyn TableProvider>> {
        self.fun.call(args)
    }
}
```
where the new trait would look something like
```rust
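use async_trait::async_trait;

// `#[async_trait]` keeps the trait usable as `Arc<dyn TableFunctionImpl>`
/// A trait for table function implementations
#[async_trait]
pub trait TableFunctionImpl: Sync + Send {
    /// Create a table provider
    async fn call(&self, args: &[Expr]) -> Result<Arc<dyn TableProvider>>;
}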
```
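One caveat: `TableFunction` stores the implementation as `Arc<dyn TableFunctionImpl>`, and a trait with a plain `async fn` can't be used as a trait object, so the future would need to be boxed (by hand, or via the `async_trait` crate, which generates a similar signature). Below is a rough, dependency-free sketch of the boxed form. `Expr`, `Result`, and `TableProvider` here are simplified stand-ins, not the DataFusion types, and `BoxFuture`, `InferredTable`, `InferFromArgs`, `block_on`, and `inferred_columns` are hypothetical names used only for illustration.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Simplified stand-ins for the DataFusion types (assumptions for this sketch).
pub struct Expr(pub String);
pub type Result<T> = std::result::Result<T, String>;

pub trait TableProvider: Send + Sync {
    /// Schema access stays synchronous; here a schema is just column names.
    fn schema(&self) -> Vec<String>;
}

/// Boxed future, so the trait below can still be used as a trait object.
pub type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Async variant of the trait, written with an explicit boxed future.
pub trait TableFunctionImpl: Sync + Send {
    fn call<'a>(&'a self, args: &'a [Expr]) -> BoxFuture<'a, Result<Arc<dyn TableProvider>>>;
}

/// Hypothetical provider whose schema is only known after `call` completes.
struct InferredTable {
    columns: Vec<String>,
}

impl TableProvider for InferredTable {
    fn schema(&self) -> Vec<String> {
        self.columns.clone()
    }
}

/// Hypothetical table function: "infers" one column per argument.
struct InferFromArgs;

impl TableFunctionImpl for InferFromArgs {
    fn call<'a>(&'a self, args: &'a [Expr]) -> BoxFuture<'a, Result<Arc<dyn TableProvider>>> {
        Box::pin(async move {
            // A real connector would await a network response at this point.
            let columns: Vec<String> = args.iter().map(|e| e.0.clone()).collect();
            let provider: Arc<dyn TableProvider> = Arc::new(InferredTable { columns });
            let out: Result<Arc<dyn TableProvider>> = Ok(provider);
            out
        })
    }
}

/// Waker that does nothing; enough for the busy-polling executor below.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor that polls one future to completion (no runtime crate).
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now();
    }
}

/// Calls the function through `Arc<dyn TableFunctionImpl>` and reads the schema.
pub fn inferred_columns(names: &[&str]) -> Result<Vec<String>> {
    let fun: Arc<dyn TableFunctionImpl> = Arc::new(InferFromArgs);
    let args: Vec<Expr> = names.iter().map(|n| Expr(n.to_string())).collect();
    let provider = block_on(fun.call(&args))?;
    Ok(provider.schema())
}
```

With this shape, `create_table_provider` would itself have to become async (or otherwise drive the future), so any caller that currently invokes it from sync code would need to change as well.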
### Describe alternatives you've considered
I've worked around this by creating the table outside of DataFusion, but I
would prefer to use table functions to achieve the same thing.
### Additional context
_No response_
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]