Hi all!

I'm wondering what people think about extending DataFusion to support
time-travel queries. This would work well with the newer table formats,
particularly Iceberg and Delta Lake, where table versioning is at the core
of the protocol.

There are more details in the issue I raised [1], but the TL;DR of the work
as I see it is:
1. extend sqlparser-rs to be aware of the `AS OF` clause (or something else
people prefer)
2. capture that information inside the `TableFactor::Table` expression
(https://github.com/sqlparser-rs/sqlparser-rs/blob/main/src/ast/query.rs#L650-L664)
3. then in DataFusion itself, while building `SessionContextProvider` and
pre-populating the tables for a given query, keep track of both the table
name and the table version specified
4. this would also mean a breaking change to `SchemaProvider::table`,
along the lines of
```rust
async fn table(
    &self,
    name: &str,
    version: Option<TableVersion>,
) -> Option<Arc<dyn TableProvider>>;
```
which would allow the provider implementation to be version-aware
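
To make step 4 a bit more concrete, here is a minimal, std-only sketch of what a version-aware lookup might look like. Everything in it is an assumption for illustration: `TableVersion` is not defined anywhere yet, the `TableProvider` and `SchemaProvider` traits below are simplified stand-ins rather than DataFusion's real ones, `VersionedSchema` and `Snapshot` are hypothetical, and the method is synchronous only to keep the example free of dependencies.

```rust
use std::collections::{BTreeMap, HashMap};
use std::sync::Arc;

/// Hypothetical version selector; its real shape is still open for discussion.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TableVersion {
    /// A snapshot or commit number.
    Number(u64),
    /// A timestamp literal as written after `AS OF`, left unparsed here.
    Timestamp(String),
}

/// Simplified stand-in for DataFusion's `TableProvider` (assumption).
pub trait TableProvider: Send + Sync {
    fn describe(&self) -> String;
}

/// Hypothetical provider pinned to one version of one table.
struct Snapshot {
    table: String,
    version: u64,
}

impl TableProvider for Snapshot {
    fn describe(&self) -> String {
        format!("{}@v{}", self.table, self.version)
    }
}

/// Simplified stand-in for `SchemaProvider` carrying the proposed signature.
pub trait SchemaProvider {
    fn table(&self, name: &str, version: Option<TableVersion>) -> Option<Arc<dyn TableProvider>>;
}

/// Hypothetical in-memory schema: table name -> (version -> provider).
pub struct VersionedSchema {
    tables: HashMap<String, BTreeMap<u64, Arc<dyn TableProvider>>>,
}

impl VersionedSchema {
    pub fn new() -> Self {
        Self { tables: HashMap::new() }
    }

    /// Registers a snapshot of `name` at `version`.
    pub fn register(&mut self, name: &str, version: u64) {
        let provider: Arc<dyn TableProvider> = Arc::new(Snapshot {
            table: name.to_string(),
            version,
        });
        self.tables
            .entry(name.to_string())
            .or_default()
            .insert(version, provider);
    }
}

impl SchemaProvider for VersionedSchema {
    fn table(&self, name: &str, version: Option<TableVersion>) -> Option<Arc<dyn TableProvider>> {
        let versions = self.tables.get(name)?;
        match version {
            // No `AS OF` clause: fall back to the latest registered version.
            None => versions.values().next_back().cloned(),
            // Exact version requested: return it only if it exists.
            Some(TableVersion::Number(v)) => versions.get(&v).cloned(),
            // Timestamp resolution is format-specific and not attempted here.
            Some(TableVersion::Timestamp(_)) => None,
        }
    }
}
```

With two versions of a table registered, `schema.table("orders", None)` would resolve to the latest one, while `schema.table("orders", Some(TableVersion::Number(1)))` would pin the first. Passing `None` keeps today's behaviour for queries that have no `AS OF` clause, so only implementors of the trait would be affected by the signature change.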

I'd be glad to commence work on this if there's consensus on the addition
of such a feature to DataFusion.

Cheers,
Marko

[1] https://github.com/apache/arrow-datafusion/issues/7292
