Hi Arrow developers,

I would like to gauge the appetite for an Arrow SQL connector that:

* Reads and writes Arrow data to and from SQL databases
* Reads tables and queries into record batches, and writes batches to tables (either append or overwrite)
* Leverages binary SQL formats where available (e.g. the PostgreSQL format is relatively easy and well-documented)
* Provides a batch interface that abstracts away the different database semantics, and exposes a RecordBatchReader (https://docs.rs/arrow/1.0.1/arrow/record_batch/trait.RecordBatchReader.html), and perhaps a RecordBatchWriter
* Resides in the Rust repo as either an arrow::sql module (like arrow::csv, arrow::json, arrow::ipc) or alternatively as a separate crate in the workspace (*arrow-sql*?)

I would be able to contribute a Postgres reader/writer as a start.
I could make this a separate crate, but to drive adoption I would prefer that it live in Arrow, where it can also stay up to date (sometimes we reorganise modules and end up breaking dependencies).
Also, being developed next to DataFusion could allow DF to support SQL databases, as this would be yet another datasource.

Some questions:

* Should such a library support async, sync or both IO methods?
* Other than Postgres, what other databases would be interesting? Here I'm hoping that once we've established a suitable API, it could be easier to natively support more database types.

Potential concerns:

* Sparse database support
  It's a lot of effort to write database connectors, especially if starting from scratch (unlike with, say, JDBC). What if we end up supporting only 1 or 2 database servers? Perhaps in that case we could keep the module without publishing it to crates.io until we're happy with database support, or even its usage.
* Dependency bloat
  We could feature-gate database types to reduce the number of dependencies if one only wants certain DB connectors.
* Why not use Java's JDBC adapter?
  I already do this, but sometimes when working on a Rust project, creating a separate JVM service solely to extract Arrow data is a lot of effort.
  I also don't think it's currently possible to use the adapter to save Arrow data in a database.
* What about Flight SQL extensions?
  There have been discussions around creating Flight SQL extensions, and the Rust SQL adapter could implement that and co-exist well.
  From a crate dependency perspective, *arrow-flight* depends on *arrow*, so it could also depend on this *arrow-sql* crate.

Please let me know what you think.

Regards,
Neville