Hi,

I've released DataFusion C, DataFusion GLib and Red
DataFusion 10.0.0, which are based on Apache Arrow
DataFusion 10.0.0.
(I'll release DataFusion C, DataFusion GLib and Red
DataFusion 11.0.0, which are based on Apache Arrow
DataFusion 11.0.0, sometime soon.)

DataFusion C:
  https://datafusion-contrib.github.io/datafusion-c/latest/
DataFusion GLib:
  https://datafusion-contrib.github.io/datafusion-c/latest/glib/
Red DataFusion:
  https://github.com/datafusion-contrib/datafusion-ruby/

Apache Arrow DataFusion is written in Rust. DataFusion C
provides a C API for Apache Arrow DataFusion. Its API uses
only standard C and has no external dependencies.

DataFusion C supports the Apache Arrow C data interface.
This means that you can register your in-memory Apache Arrow
data with Apache Arrow DataFusion, and retrieve in-memory
results from Apache Arrow DataFusion as Apache Arrow data,
without any external library, because the Apache Arrow C
data interface uses only the standard C ABI.
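
To illustrate what "only the standard C ABI" means in
practice, here is a minimal sketch that mirrors the two
structs of the Apache Arrow C data interface (ArrowSchema
and ArrowArray, as laid out in the Apache Arrow
specification) with nothing but Python's ctypes. It isn't
taken from DataFusion C itself, and the sample schema values
at the end are only for illustration:

```python
import ctypes


class ArrowSchema(ctypes.Structure):
    """Mirror of `struct ArrowSchema` in the Arrow C data interface."""


# The struct refers to itself, so the fields are assigned after the
# class object exists.
ArrowSchema._fields_ = [
    ("format", ctypes.c_char_p),
    ("name", ctypes.c_char_p),
    # Declared as `const char *` in C, but it holds binary data, so it
    # is kept as a raw pointer here.
    ("metadata", ctypes.c_void_p),
    ("flags", ctypes.c_int64),
    ("n_children", ctypes.c_int64),
    ("children", ctypes.POINTER(ctypes.POINTER(ArrowSchema))),
    ("dictionary", ctypes.POINTER(ArrowSchema)),
    ("release", ctypes.CFUNCTYPE(None, ctypes.POINTER(ArrowSchema))),
    ("private_data", ctypes.c_void_p),
]


class ArrowArray(ctypes.Structure):
    """Mirror of `struct ArrowArray` in the Arrow C data interface."""


ArrowArray._fields_ = [
    ("length", ctypes.c_int64),
    ("null_count", ctypes.c_int64),
    ("offset", ctypes.c_int64),
    ("n_buffers", ctypes.c_int64),
    ("n_children", ctypes.c_int64),
    ("buffers", ctypes.POINTER(ctypes.c_void_p)),
    ("children", ctypes.POINTER(ctypes.POINTER(ArrowArray))),
    ("dictionary", ctypes.POINTER(ArrowArray)),
    ("release", ctypes.CFUNCTYPE(None, ctypes.POINTER(ArrowArray))),
    ("private_data", ctypes.c_void_p),
]

# Illustrative values only: "l" is the format string for int64.
schema = ArrowSchema(format=b"l", name=b"value")
```

Any language that can describe plain C structs like these
can exchange Apache Arrow data without linking against an
Apache Arrow library.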

DataFusion C is suitable for creating language bindings for
Apache Arrow DataFusion because many languages have built-in
C support. Here are examples that use DataFusion C from
other languages:

Python with ctypes:
  https://datafusion-contrib.github.io/datafusion-c/latest/example/sql.html#raw-c-api-from-python
Ruby with Fiddle:
  https://datafusion-contrib.github.io/datafusion-c/latest/example/sql.html#raw-c-api-from-ruby
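
For readers who haven't used built-in C support before, here
is a minimal sketch of the mechanism with Python's ctypes.
It calls strlen() from the C standard library as a stand-in.
It does not use DataFusion C's API, and it assumes a POSIX
system. The linked examples show the real DataFusion C
calls:

```python
import ctypes

# Assumption: a POSIX system, where CDLL(None) exposes the symbols
# already loaded into the process, including the C standard library.
libc = ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# Call the C function directly with a bytes object.
length = libc.strlen(b"DataFusion")
print(length)
```

Binding a library such as DataFusion C follows the same
pattern: load the shared library, declare each function's
argument and return types, then call it.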

DataFusion GLib provides a GLib API and is built on top of
DataFusion C. DataFusion GLib is also suitable for creating
language bindings because it supports GObject
Introspection, a middleware that generates language
bindings dynamically:
  https://gi.readthedocs.io/en/latest/

DataFusion GLib is integrated with Apache Arrow GLib, so you
can use a more convenient API than the one provided by
DataFusion C. DataFusion C exposes the raw Apache Arrow C
data interface, whereas DataFusion GLib uses Apache Arrow
GLib objects instead. DataFusion GLib still uses the Apache
Arrow C data interface internally, but it hides the details
from users.

If a language supports GObject Introspection, you can
generate a language binding with a few lines of code. Here
are examples that use DataFusion GLib from other languages:

Python with PyGObject:
  https://datafusion-contrib.github.io/datafusion-c/latest/example/sql.html#glib-api-from-python
Ruby with gobject-introspection gem:
  https://datafusion-contrib.github.io/datafusion-c/latest/example/sql.html#glib-api-from-ruby

There are binary packages of DataFusion C and DataFusion
GLib for Debian GNU/Linux, Ubuntu and AlmaLinux:
  https://datafusion-contrib.github.io/datafusion-c/latest/install.html

They are provided from the same repository that Apache Arrow
C++ and Apache Arrow GLib use:
  https://apache.jfrog.io/artifactory/arrow/

This means that you can install Apache Arrow C++, Apache
Arrow GLib, DataFusion C and DataFusion GLib from the same
APT/Yum repositories.


Red DataFusion provides Ruby bindings for Apache Arrow
DataFusion. It's based on DataFusion GLib and is well
integrated with Red Arrow.

There is another set of Apache Arrow DataFusion bindings for
Ruby:
https://github.com/jychen7/arrow-datafusion-ruby

It depends only on Apache Arrow DataFusion. It doesn't
depend on Red Arrow and doesn't use the Apache Arrow C data
interface. Instead, it uses Rust code to convert DataFusion
objects to Ruby objects.


If you're interested in these projects, there are GitHub
Discussions for them:

  * DataFusion C/DataFusion GLib:
    https://github.com/datafusion-contrib/datafusion-c/discussions
  * Red DataFusion:
    https://github.com/datafusion-contrib/datafusion-ruby/discussions


Thanks,
-- 
kou
