Thank you Jiayu. I tested the package and was able to install it and run the example from the readme (though I think the example needs updating).
I see no reason not to publish this.

Here is what I did:

Ran the following command to get it to install (cargo culting from a Stack Overflow post):

pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple datafusion==0.5.0

Some things I noticed:

1. The example [1] in the readme had errors:

Python 3.9.10 (main, Jan 15 2022, 11:48:04)
[Clang 13.0.0 (clang-1300.0.29.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import datafusion
>>> import pyarrow
>>>
>>>
>>> # an alias
>>> f = datafusion.functions
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'datafusion' has no attribute 'functions'

It looks like the `functions` module is no longer defined, and `col` is defined directly in the datafusion module itself. The following worked:

>>> df = df.select(datafusion.col("a") + datafusion.col("b"))
>>> result = df.collect()[0]
>>> result.column(0)
<pyarrow.lib.Int64Array object at 0x12972e460>
[
  5,
  7,
  9
]

Andrew

[1] https://github.com/datafusion-contrib/datafusion-python#how-to-use-it

On Wed, Mar 9, 2022 at 7:18 PM Jiayu Liu <ji...@hey.com.invalid> wrote:

> Andrew, I've added the package to testpypi at
> https://test.pypi.org/project/datafusion/0.5.0/ so for anyone interested
> in trying out,
>
> pip install -i https://test.pypi.org/simple/ datafusion==0.5.0
>
> is the easiest way. You can uninstall that after trial.
>
> Andy I think I'd like to release this as is if no bugs or issues found.
> If you'd want to add more features feel free to add in this week or next
> and I'll do another release (0.5.1) because it's not breaking change.
> The current minor version release is mostly due to maturin upgrade.
>
> On March 10, 2022, Andy Grove <andygrov...@gmail.com> wrote:
> > I noticed that PyDataFrame is missing a number of methods that we have in
> > Rust DataFrame so I am planning on putting up a PR this evening to add some
> > of these.
> > It would be good to get these in prior to releasing.
> >
> > On Wed, Mar 9, 2022 at 2:37 PM Andrew Lamb <al...@influxdata.com> wrote:
> >
> > > Hi Jaiyu -- thanks for leading this effort -- I would be happy to help test
> > > this release -- do you have any suggestions about how to do so (i.e.
> > > instructions for installing the wheel, what commands to try, etc?)
> > >
> > > On Wed, Mar 9, 2022 at 10:14 AM Jiayu Liu <ji...@hey.com.invalid> wrote:
> > >
> > > > Thanks Andy, I had the same confusion while drafting this email but I
> > > > figured better safe than sorry. Now that it's clear, (to everyone
> > > > reading this), please ignore the voting part. Given the likelihood that
> > > > someone are using this so I'll give it a few days before doing the
> > > > official publish.
> > > >
> > > > On March 9, 2022, Andy Grove <andygrov...@gmail.com> wrote:
> > > > > Hi Jiayu,
> > > > >
> > > > > I'm very interested in seeing an updated release for the Python bindings
> > > > > but I am a bit confused about the governance on the project. I have been
> > > > > away from the project for a while so I may have missed something, but I
> > > > > don't think any of the repos under datafusion-contrib are under Apache
> > > > > Arrow governance? Given that, I am not sure that it makes sense to go
> > > > > through the Apache voting process here? I think the community is free to
> > > > > release at will or implement its own governance for release votes.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Andy.
> > > > >
> > > > > On Wed, Mar 9, 2022 at 6:54 AM Jiayu Liu <ji...@hey.com.invalid> wrote:
> > > > >
> > > > > > Greetings Arrow dev community,
> > > > > >
> > > > > > I am not sure if voting will still be needed in the future but I'd like
> > > > > > to propose a release of Apache Arrow Datafusion's Python binding (which
> > > > > > now lives in https://github.com/datafusion-contrib/datafusion-python)
> > > > > > version 0.5.0.
> > > > > >
> > > > > > The release candidate is based on commit [1], the same commit is on a
> > > > > > pull request at [2], and you can download the pre-built wheel files via
> > > > > > link [3], which contains 3 wheel files (win, macOS, and manylinux),
> > > > > > along with 1 source tarball.
> > > > > >
> > > > > > Only votes from PMC members are binding but all members of the community
> > > > > > are encouraged to test the release and vote with "(non-binding)".
> > > > > >
> > > > > > [ ] +1 Release this as 0.5.0
> > > > > > [ ] +0
> > > > > > [ ] -1 Do not release this as 0.5.0 because...
> > > > > >
> > > > > > [1]: https://github.com/datafusion-contrib/datafusion-python/commit/ef82a992af3f41a3bcd057a7d98834910cefabe5
> > > > > > [2]: https://github.com/datafusion-contrib/datafusion-python/pull/34
> > > > > > [3]: https://github.com/datafusion-contrib/datafusion-python/suites/5590592272/artifacts/181291336