On Mon, Nov 9, 2020 at 9:32 AM Niranda Perera <niranda.per...@gmail.com> wrote:
>
> @Ben
> Thank you very much for the feedback. But unfortunately, I was unable to
> find a header that exposes a SumAggregateKernel in the v2.0.0. Maybe I am
> checking it wrong. I remember accessing them in v0.16 IINM.
>
> @Wes
> Yes, that would be great. How about adding a CMake compilation flag for
> such dev use cases?
>
This seems like it could cause more problems -- I think it would be
sufficient to use an "internal::" C++ namespace and always install the
relevant header file.
>
> On Sun, Nov 8, 2020 at 9:14 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > I'm not opposed to installing headers that provide access to some of
> > the kernel implementation internals (with the caveat that changes
> > won't go through a deprecation cycle, so caveat emptor). It might be
> > more sustainable to think about what kind of stable-ish public API
> > could be exported to support applications like Cylon.
> >
> > On Sun, Nov 8, 2020 at 10:37 AM Ben Kietzman <b...@ursacomputing.com> wrote:
> > >
> > > Hi Niranda,
> > >
> > > SumImpl is a subclass of KernelState. Given a SumAggregateKernel, one can
> > > produce zeroed KernelState using the `init` member, then operate on data
> > > using the `consume`, `merge`, and `finalize` members. You can look at
> > > ScalarAggExecutor for an example of how to get from a compute function to
> > > kernels and kernel state. Will that work for you?
> > >
> > > Ben Kietzman
> > >
> > > On Sun, Nov 8, 2020, 11:21 Niranda Perera <niranda.per...@gmail.com> wrote:
> > >
> > > > Hi Ben,
> > > >
> > > > We are building a distributed table abstraction on top of Arrow dataframes
> > > > called Cylon (https://github.com/cylondata/cylon). Currently we have a
> > > > simple aggregation and group-by operation implementation. But we felt that
> > > > we could offer more functionality if we could import Arrow kernels and
> > > > states into the corresponding Cylon distributed kernels.
> > > > Ex: For a distributed mean, we would have to communicate the local Arrow
> > > > SumState, then do a SumImpl::MergeFrom() and then call Finalize.
> > > > Is there any other way to access these intermediate states from compute
> > > > operations?
> > > >
> > > > On Sun, Nov 8, 2020 at 11:11 AM Ben Kietzman <b...@ursacomputing.com> wrote:
> > > >
> > > > > Hi Niranda,
> > > > >
> > > > > What is the context of your work? If you're working inside the Arrow
> > > > > repository you shouldn't need to install headers before using them, and
> > > > > we welcome PRs for new kernels. Otherwise, could you provide some details
> > > > > about how your work is using Arrow as a dependency?
> > > > >
> > > > > Ben Kietzman
> > > > >
> > > > > On Sun, Nov 8, 2020, 10:57 Niranda Perera <niranda.per...@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I was wondering if I could use the arrow/compute/kernels/*internal.h
> > > > > > headers in my work? I would like to reuse some of the kernel
> > > > > > implementations and kernel states.
> > > > > >
> > > > > > With -DARROW_COMPUTE=ON, those headers are not added into the include dir.
> > > > > > I see that the *internal.h headers are skipped from
> > > > > > the ARROW_INSTALL_ALL_HEADERS cmake function unfortunately.
> > > > > >
> > > > > > Best
> > > > > > --
> > > > > > Niranda Perera
> > > > > > @n1r44 <https://twitter.com/N1R44>
> > > > > > +1 812 558 8884 / +94 71 554 8430
> > > > > > https://www.linkedin.com/in/niranda
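
For readers following the thread: the init / consume / merge / finalize flow
that Ben describes, applied to the distributed mean Niranda mentions, might
look roughly like the standalone sketch below. It does not use Arrow at all.
MeanState, Consume, MergeFrom, Finalize and ExampleDistributedMean are
hypothetical names chosen for illustration and are not Arrow's SumImpl or
KernelState API. The only point is that each worker keeps a small mergeable
state (sum and count) that can be shipped elsewhere and combined before the
final division.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a kernel state. NOT Arrow's SumImpl/KernelState.
struct MeanState {
  double sum = 0.0;        // running sum of consumed values
  std::int64_t count = 0;  // number of consumed values

  // "consume": fold a local batch of values into this state.
  void Consume(const std::vector<double>& batch) {
    for (double v : batch) {
      sum += v;
      ++count;
    }
  }

  // "merge": combine a partial state computed elsewhere (e.g. another worker).
  void MergeFrom(const MeanState& other) {
    sum += other.sum;
    count += other.count;
  }

  // "finalize": write the mean to *out and return true,
  // or return false if no values were consumed.
  bool Finalize(double* out) const {
    assert(out != nullptr);
    if (count == 0) return false;
    *out = sum / static_cast<double>(count);
    return true;
  }
};

// Simulates two workers in one process: each consumes its own partition,
// then the partial states are merged on one side and finalized there.
// Returns 0.0 if there was no data (an arbitrary choice for this sketch).
inline double ExampleDistributedMean() {
  MeanState worker_a;  // "init": zeroed state
  MeanState worker_b;
  worker_a.Consume({1.0, 2.0, 3.0});
  worker_b.Consume({4.0, 5.0});

  // In a real system, worker_b's (sum, count) would be sent over the network.
  worker_a.MergeFrom(worker_b);

  double mean = 0.0;
  worker_a.Finalize(&mean);
  return mean;
}
```

Merging the (sum, count) pairs and dividing once at the end gives the mean
over all partitions, whereas averaging the per-worker means would be wrong
whenever the partitions differ in size.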