Thank you Jiayu. I tested the package and was able to install it and run
the example from the readme (though I think the example needs updating).

I see no reason not to publish this.

Here is what I did:
I ran the following command to get it to install (cargo culting from a Stack
Overflow post):

pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple datafusion==0.5.0


Some things I noticed:


1. The example[1] on the readme had errors


Python 3.9.10 (main, Jan 15 2022, 11:48:04)
[Clang 13.0.0 (clang-1300.0.29.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import datafusion
>>> import pyarrow
>>>
>>>
>>> # an alias
>>> f = datafusion.functions
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'datafusion' has no attribute 'functions'


It looks like the `functions` module is no longer defined, and `col` is
defined directly in the datafusion module itself. The following worked:


>>> df = df.select(datafusion.col("a") + datafusion.col("b"))
>>> result = df.collect()[0]
>>> result.column(0)
<pyarrow.lib.Int64Array object at 0x12972e460>
[
  5,
  7,
  9
]
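
If the readme example needs to run against both layouts for a while, one option is to look up `col` defensively. This is only a sketch: `resolve_col` is a hypothetical helper, and the two `SimpleNamespace` objects are stand-ins for the module (one shaped the way the readme assumes, one shaped the way I saw it in 0.5.0), so it runs without datafusion installed.

```python
from types import SimpleNamespace


def resolve_col(module):
    """Return `col` from `module.functions` if that exists, else from `module` itself."""
    functions = getattr(module, "functions", None)
    if functions is not None and hasattr(functions, "col"):
        return functions.col
    return module.col


# Stand-in for the layout the readme assumes: datafusion.functions.col
readme_layout = SimpleNamespace(
    functions=SimpleNamespace(col=lambda name: ("functions.col", name))
)
# Stand-in for the layout observed above: datafusion.col
observed_layout = SimpleNamespace(col=lambda name: ("col", name))

print(resolve_col(readme_layout)("a"))
print(resolve_col(observed_layout)("a"))
```

With the real package this would be used as `col = resolve_col(datafusion)` followed by `df.select(col("a") + col("b"))`, though simply updating the readme is probably the better fix.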



Andrew



[1]  https://github.com/datafusion-contrib/datafusion-python#how-to-use-it

On Wed, Mar 9, 2022 at 7:18 PM Jiayu Liu <ji...@hey.com.invalid> wrote:

> Andrew, I've added the package to testpypi at
> https://test.pypi.org/project/datafusion/0.5.0/ so for anyone interested
> in trying it out,
>
> pip install -i https://test.pypi.org/simple/ datafusion==0.5.0
>
> is the easiest way. You can uninstall that after trial.
>
> Andy, I think I'd like to release this as is if no bugs or issues are found.
> If you want to add more features, feel free to add them this week or next
> and I'll do another release (0.5.1), because it's not a breaking change.
> The current minor version release is mostly due to the maturin upgrade.
>
> On March 10, 2022, Andy Grove <andygrov...@gmail.com> wrote:
> > I noticed that PyDataFrame is missing a number of methods that we have in
> > Rust DataFrame so I am planning on putting up a PR this evening to add some
> > of these. It would be good to get these in prior to releasing.
> >
> > On Wed, Mar 9, 2022 at 2:37 PM Andrew Lamb <al...@influxdata.com>
> > wrote:
> >
> > > Hi Jiayu -- thanks for leading this effort -- I would be happy to help test
> > > this release -- do you have any suggestions about how to do so (i.e.
> > > instructions for installing the wheel, what commands to try, etc?)
> > >
> > > On Wed, Mar 9, 2022 at 10:14 AM Jiayu Liu <ji...@hey.com.invalid> wrote:
> > >
> > > > Thanks Andy, I had the same confusion while drafting this email but I
> > > > figured better safe than sorry. Now that it's clear (to everyone
> > > > reading this), please ignore the voting part. Given the likelihood that
> > > > someone is using this, I'll give it a few days before doing the
> > > > official publish.
> > > >
> > > > On March 9, 2022, Andy Grove <andygrov...@gmail.com> wrote:
> > > > > Hi Jiayu,
> > > > >
> > > > > I'm very interested in seeing an updated release for the Python bindings
> > > > > but I am a bit confused about the governance on the project. I have been
> > > > > away from the project for a while so I may have missed something, but I
> > > > > don't think any of the repos under datafusion-contrib are under Apache
> > > > > Arrow governance? Given that, I am not sure that it makes sense to go
> > > > > through the Apache voting process here? I think the community is free to
> > > > > release at will or implement its own governance for release votes.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Andy.
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Mar 9, 2022 at 6:54 AM Jiayu Liu <ji...@hey.com.invalid>
> > > > > wrote:
> > > > >
> > > > > > Greetings Arrow dev community,
> > > > > >
> > > > > > I am not sure if voting will still be needed in the future but I'd like
> > > > > > to propose a release of Apache Arrow Datafusion's Python binding (which
> > > > > > now lives in https://github.com/datafusion-contrib/datafusion-python)
> > > > > > version 0.5.0.
> > > > > >
> > > > > > The release candidate is based on commit [1], the same commit is on a
> > > > > > pull request at [2], and you can download the pre-built wheel files via
> > > > > > link [3], which contains 3 wheel files (win, macOS, and manylinux),
> > > > > > along with 1 source tarball.
> > > > > >
> > > > > > Only votes from PMC members are binding but all members of the community
> > > > > > are encouraged to test the release and vote with "(non-binding)".
> > > > > >
> > > > > > [ ] +1 Release this as 0.5.0
> > > > > > [ ] +0
> > > > > > [ ] -1 Do not release this as 0.5.0 because...
> > > > > >
> > > > > >
> > > > > > [1]: https://github.com/datafusion-contrib/datafusion-python/commit/ef82a992af3f41a3bcd057a7d98834910cefabe5
> > > > > > [2]: https://github.com/datafusion-contrib/datafusion-python/pull/34
> > > > > > [3]: https://github.com/datafusion-contrib/datafusion-python/suites/5590592272/artifacts/181291336
> > > > > >
> > > >
> > >
>
