Yes, it would be great to have a component/library that can be embedded
in any other product and can perform operations like
aggregation/join/filter/etc. on Arrow datasets.
Do you think it would be really hard to extract this part out of dremio-oss?


Sincerely,
Michael Shtelma

On Sat, Jul 22, 2017 at 2:13 AM, Jacques Nadeau <jacq...@apache.org> wrote:

> We do have relational operators as well in our code. We're trying to figure
> out what to contribute back and how to factor it. For now, the code is under
> the Apache license, so you are free to use it. Our relational operations are
> under here:
>
> https://github.com/dremio/dremio-oss/tree/master/sabot/kernel/src/main/java/com/dremio/sabot/op
>
> For example, you can see how we do a columnar pivot for the purposes of
> aggregation here:
> https://github.com/dremio/dremio-oss/blob/master/sabot/kernel/src/main/java/com/dremio/sabot/op/aggregate/vectorized/VectorizedHashAggOperator.java
>
> and here (pivotVariableLengths is especially fun):
> https://github.com/dremio/dremio-oss/blob/master/sabot/kernel/src/main/java/com/dremio/sabot/op/common/ht2/Pivots.java
>
> Our goal is ultimately to make Sabot componentized enough that you can use
> pieces as a library but it will take some time to get all the way there.
>
>
>
> On Fri, Jul 21, 2017 at 4:35 AM, Michael Shtelma <mshte...@gmail.com>
> wrote:
>
> > Hi Wes,
> >
> > It is really great that you have open-sourced all this!
> > As far as I understand, you have also open-sourced the engine that can
> > execute relational operators on Arrow?
> > Is it possible to use it as a library?
> > Are you also planning to donate it to the Arrow project at some point?
> >
> > Sincerely,
> > Michael Shtelma
> >
> > On Thu, Jul 20, 2017 at 10:19 PM, Wes McKinney <wesmck...@gmail.com>
> > wrote:
> >
> > > hi Sven,
> > >
> > > There is a placeholder project in apache/parquet-mr
> > > https://github.com/apache/parquet-mr/tree/master/parquet-arrow.
> > >
> > > It appears in the meantime that Dremio has created a vectorized
> > > Parquet <-> Arrow reader/writer which has just been open sourced under
> > > ASL 2.0:
> > > https://github.com/dremio/dremio-oss/tree/master/sabot/kernel/src/main/java/com/dremio/exec/store/parquet
> > >
> > > I am sure they are very busy right now, but it may be worth discussing
> > > factoring out this Parquet <-> Arrow interface into a library
> > > component that can be donated to Apache Parquet.
> > >
> > > - Wes
> > >
> > > On Wed, Jul 19, 2017 at 4:28 PM, Sven Wagner-Boysen
> > > <sven.wagner-boy...@signavio.com> wrote:
> > > > Hi,
> > > >
> > > > I started looking into the projects Parquet and Arrow. Looks very
> > > > promising to me.
> > > >
> > > > I also came across PyArrow and the Parquet-Arrow integration in
> > > > Python. Is there something similar available for Java?
> > > >
> > > > https://arrow.apache.org/docs/python/parquet.html
> > > >
> > > > Thanks
> > > > Sven
> > >
> >
>
