The current Ballista Python bindings [1] were created by cloning the
DataFusion Python bindings and then modifying them. The resulting
codebase has proved challenging to maintain and has not been updated for
almost a year. The repository contains around 1,100 lines of Rust code.

I propose that we archive this repository and adopt a new Python client
that only exposes SQL capabilities rather than providing both SQL and
DataFrame APIs. I have a PR [2] up for a new client, which contains only
75 lines of Rust code. The new client uses the datafusion-python crate as
a dependency rather than duplicating code.

My hope is that this much leaner implementation will be easier to maintain
and keep up-to-date with Ballista releases. We can add the DataFrame API in
the future as a thin wrapper around the datafusion-python dependency if the
project gains enough traction.

If there are no objections, I will go ahead and archive the old repository
in the next week or two (and update the README to point to the new client).

Thanks,

Andy.

[1] https://github.com/apache/arrow-ballista-python

[2] https://github.com/apache/arrow-ballista/pull/970
