Yes, it would be great to have a component/library that can be embedded in any other product and perform operations like aggregation/join/filter/etc. on Arrow datasets. Do you think it would be really hard to extract this part out of dremio-oss?
Sincerely,
Michael Shtelma

On Sat, Jul 22, 2017 at 2:13 AM, Jacques Nadeau <jacq...@apache.org> wrote:
> We do have relational operators as well in our code. We're trying to figure
> out what to contribute back and how to factor. For now, the code is under
> Apache license you are free to use. Our relational operations are under
> here:
>
> https://github.com/dremio/dremio-oss/tree/master/sabot/kernel/src/main/java/com/dremio/sabot/op
>
> For example, you can see how we do a columnar pivot for the purposes of
> aggregation here:
> https://github.com/dremio/dremio-oss/blob/master/sabot/kernel/src/main/java/com/dremio/sabot/op/aggregate/vectorized/VectorizedHashAggOperator.java
>
> and here (pivotVariableLengths is especially fun):
> https://github.com/dremio/dremio-oss/blob/master/sabot/kernel/src/main/java/com/dremio/sabot/op/common/ht2/Pivots.java
>
> Our goal is ultimately to make Sabot componentized enough that you can use
> pieces as a library but it will take some time to get all the way there.
>
> On Fri, Jul 21, 2017 at 4:35 AM, Michael Shtelma <mshte...@gmail.com> wrote:
>
> > Hi Wes,
> >
> > It is really great, that you have open-sourced all this!
> > As far as I understand, you have also open-sourced the engine that can
> > execute relational operators on arrow ?
> > Is it possible to use it as library ?
> > Are you also planning to donate it arrow project at some point?
> >
> > Sincerely,
> > Michael Shtelma
> >
> > On Thu, Jul 20, 2017 at 10:19 PM, Wes McKinney <wesmck...@gmail.com> wrote:
> >
> > > hi Sven,
> > >
> > > There is a placeholder project in apache/parquet-mr
> > > https://github.com/apache/parquet-mr/tree/master/parquet-arrow.
> > >
> > > It appears in the meantime that Dremio has created a vectorized
> > > Parquet <-> Arrow reader/writer which has just been open sourced under
> > > ASL 2.0: https://github.com/dremio/dremio-oss/tree/master/sabot/kernel/src/main/java/com/dremio/exec/store/parquet
> > >
> > > I am sure they are very busy right now, but it may be worth discussing
> > > factoring out this Parquet <-> Arrow interface into a library
> > > component that can be donated to Apache Parquet.
> > >
> > > - Wes
> > >
> > > On Wed, Jul 19, 2017 at 4:28 PM, Sven Wagner-Boysen
> > > <sven.wagner-boy...@signavio.com> wrote:
> > > > Hi,
> > > >
> > > > I started looking into the projects Parquet and Arrow. Looks very
> > > > promising to me.
> > > >
> > > > I also came across PyArrow and the Parquet-Arrow integration in Python.
> > > > Is there something similar available for Java?
> > > >
> > > > https://arrow.apache.org/docs/python/parquet.html
> > > >
> > > > Thanks
> > > > Sven