ODBC and JDBC do not specify a wire protocol. So, while the client APIs are
definitely row-based, any particular driver could use a protocol that is based
on Arrow data.
There is immense investment in ODBC and JDBC drivers, and they handle complex
cases such as connection pooling, statement can…
I think this is a great initiative.
If I understand correctly, it would open up Arrow to many more use cases
and allow, for example, connecting BI tools such as Power BI and Tableau to
DataFusion. I'll also try to make some time to support this.
Thanks!
Sven
On Mon, Sep 28, 2020 at 3:49 AM Andy Grove wrote:
I didn't get a chance yet to really read this thread in detail but I am
definitely very interested in this conversation and will make time this
week to add my thoughts.
Thanks,
Andy.
On Sun, Sep 27, 2020, 4:01 PM Adam Lippai wrote:
Hi Neville,
yes, my concern with common row-based DB APIs is that I use
Arrow/Parquet for OLAP too.
What https://turbodbc.readthedocs.io/en/latest/ (Python) and
https://github.com/pacman82/odbc-api#state (Rust) do is read
large blocks of data instead of processing rows one by one, …
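The block-wise idea those libraries use can be sketched with Python's built-in sqlite3 module (a stand-in here: turbodbc and odbc-api expose the same fetch-in-blocks pattern through their own APIs, and the table, data, and block size below are invented):

```python
import sqlite3

# In-memory database with some sample rows (made-up data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, val REAL)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(i, i * 0.5) for i in range(10)])

cur = conn.cursor()
cur.execute("SELECT id, val FROM t")

# Pull rows in blocks instead of one by one; each block could be
# handed to a columnar consumer (e.g. turned into an Arrow batch).
blocks = []
while True:
    block = cur.fetchmany(4)  # block size is arbitrary here
    if not block:
        break
    blocks.append(block)

print([len(b) for b in blocks])  # → [4, 4, 2]
```

The point is that the per-call overhead is amortized over a whole block, which is what makes the columnar/OLAP use case viable on top of a row-oriented driver.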
Thanks for the feedback.
My interest is mainly in the narrow use case of reading and writing batch
data, so I wouldn't want to deal with producing and consuming rows per se.
Andy has worked on RDBC (https://github.com/tokio-rs/rdbc) for the
row-based or OLTP case, and I'm considering something more …
One more universal approach is to use ODBC; this is a recent Rust
conversation (with an example) on the topic:
https://github.com/Koka/odbc-rs/issues/140
Honestly, I find the Python DB API too simple: all it provides is a
row-by-row API. I miss four things:
- Batched or bulk processing, both for da…
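For what it's worth, PEP 249 does offer one bulk primitive on the write side, executemany, which avoids a round trip per row. A minimal sqlite3 illustration (table name and data invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts INTEGER, msg TEXT)")

rows = [(i, f"event-{i}") for i in range(1000)]

# Bulk path: one executemany call instead of 1000 execute calls.
conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(count)  # → 1000
```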
That would be awesome! I agree with this, and it would be really useful, as
it would leverage all the goodies that RDBMSs have with respect to
transactions, etc.
I would probably go for having database specifics outside of the Arrow
project, so that they can be used by other folks beyond Arrow, and keep the
arrow …
Hi Neville,
In Python we have something called the DB API 2.0 (PEP 249), which
defines an API standard for SQL databases in Python, including an
expectation around the data format of result sets. It sounds like you
need to create the equivalent of that in Rust, with Arrow as the API /
format returned.
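One way to picture "Arrow as the format returned": take a PEP 249 result set (row tuples plus cursor.description metadata) and pivot it into column-major arrays, which is roughly what an Arrow-returning driver would do internally. A stdlib-only sketch with sqlite3 (schema and data invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO people VALUES (?, ?)",
                 [("ada", 36), ("grace", 45)])

cur = conn.cursor()
cur.execute("SELECT name, age FROM people")

# PEP 249: description[i][0] is the column name; rows are tuples.
names = [d[0] for d in cur.description]
rows = cur.fetchall()

# Transpose row-major tuples into column-major lists -- the shape an
# Arrow RecordBatch would hold (one contiguous array per column).
columns = {name: list(col) for name, col in zip(names, zip(*rows))}
print(columns)  # → {'name': ['ada', 'grace'], 'age': [36, 45]}
```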
Hi Arrow developers,
I would like to gauge the appetite for an Arrow SQL connector that:
* Reads and writes Arrow data to and from SQL databases
* Reads tables and queries into record batches, and writes batches to
tables (either append or overwrite)
* Leverages binary SQL formats where available
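The shape such a connector could take might look like the stdlib-only sketch below. The names read_query and write_table are hypothetical, real batches would be Arrow RecordBatches rather than lists of tuples, and a real connector would validate interpolated identifiers:

```python
import sqlite3

def read_query(conn, sql, batch_size=1024):
    """Yield result rows in fixed-size batches (stand-ins for record batches)."""
    cur = conn.cursor()
    cur.execute(sql)
    while True:
        batch = cur.fetchmany(batch_size)
        if not batch:
            return
        yield batch

def write_table(conn, table, rows, mode="append"):
    """Write rows to a table, either appending or overwriting."""
    if mode == "overwrite":
        conn.execute(f"DELETE FROM {table}")  # naive; sketch only
    placeholders = ",".join("?" * len(rows[0]))
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
write_table(conn, "t", [(i,) for i in range(5)])            # append
write_table(conn, "t", [(9,)], mode="overwrite")            # overwrite
batches = list(read_query(conn, "SELECT id FROM t", batch_size=2))
print(batches)  # → [[(9,)]]
```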