Re: Gandiva

Walaa Eldin Moustafa Sat, 30 Jun 2018 14:35:36 -0700

Hi Julian and Masayuki,

This indeed sounds quite important. Masayuki, thanks for taking the
initiative. I would like to do what I can to help. I can help with
writing some of the operators, UDFs/UDF APIs, and integration with Calcite.


Thanks,
Walaa.


On Fri, Jun 29, 2018 at 11:40 AM Julian Hyde <[email protected]> wrote:

> We already have two JIRA cases for Arrow integration:
> https://issues.apache.org/jira/browse/CALCITE-2040 and
> https://issues.apache.org/jira/browse/CALCITE-2173.
>
> I think this is an extremely important area of work for the Calcite
> project, because it helps us realize the vision of a deconstructed
> database[1]. There is a lot of work to do, much of it very interesting
> (e.g. writing a thread scheduler, IPC mechanisms, and algorithms for
> sort, join and aggregation that work effectively on Arrow data
> structures).
>
> If you want to help Masayuki, please step up!
>
> Julian
>
> [1]
> https://www.slideshare.net/julienledem/from-flat-files-to-deconstructed-database
>
> On Thu, Jun 28, 2018 at 2:24 PM, Michael Mior <[email protected]> wrote:
> > That's great! If you could create a JIRA case to track your progress,
> > that would be helpful for others who might want to follow along or
> > contribute. Thanks!
> >
> > --
> > Michael Mior
> > [email protected]
> >
> >
> >
> > On Tue, Jun 26, 2018 at 10:36, Masayuki Takahashi <[email protected]> wrote:
> >
> >> Hi Julian,
> >>
> >> > Masayuki Takahashi has started to develop an Arrow adapter for
> >> > Calcite[2], but a lot of work remains to implement all SQL built-in
> >> > functions and basic relational operators. Building on top of Gandiva
> >> > we could save a lot of this effort.
> >>
> >> I will start by setting up a Gandiva development environment and then
> >> consider how to incorporate it.
> >>
> >> thanks.
> >>
> >>
> >>
> >> On Sat, Jun 23, 2018 at 3:54, Julian Hyde <[email protected]> wrote:
> >> >
> >> > Suppose a company wishes to build a graph database using their own
> >> > innovative graph index data structure. They nevertheless need to
> >> > implement core relational algebra, core data types, and core built-in
> >> > functions (+, CASE, SUM, SUBSTRING). And they want to implement these
> >> > on a memory-efficient data structure (tens of thousands of rows,
> >> > stored column-oriented, per memory block). This is a massive effort.
> >> >
> >> > With Calcite+Gandiva+Arrow they just need to create a sequence of
> >> > relational operators (using RelBuilder, say) and efficient machine
> >> > code is generated. They can then start adding their own data types,
> >> > built-in functions, and relational operators, using the same
> >> > architecture.
> >> >
> >> > Julian
> >> >
> >> >
> >> > > On Jun 22, 2018, at 11:33 AM, Xiening Dai <[email protected]> wrote:
> >> > >
> >> > > I was in a talk regarding Gandiva yesterday. Impressive work!
> >> > >
> >> > > But I am not sure why Calcite would want to integrate with it. To me,
> >> > > Gandiva is on the execution side. In which scenarios would a query
> >> > > planner need an Arrow engine? I read the original JIRA about
> >> > > implementing a file enumerator, but the intent is still not clear to
> >> > > me. I would appreciate it if you could elaborate. Thanks.
> >> > >
> >> > >
> >> > >> On Jun 22, 2018, at 11:20 AM, Julian Hyde <[email protected]> wrote:
> >> > >>
> >> > >> There is a discussion on dev@arrow about Gandiva, a kernel for
> >> > >> Arrow[1].
> >> > >>
> >> > >> I think it would be an interesting library on which to build our
> >> > >> Arrow engine. (Without a kernel, Arrow is just a data format, but
> >> > >> with Gandiva it becomes an engine upon which we can implement all
> >> > >> relational operations, albeit on a multi-threaded single node.
> >> > >> Potentially this approach can process each row in a few machine
> >> > >> cycles, i.e. billions of records per second. Therefore single-node
> >> > >> would be sufficient for many queries.)
> >> > >>
> >> > >> Masayuki Takahashi has started to develop an Arrow adapter for
> >> > >> Calcite[2], but a lot of work remains to implement all SQL built-in
> >> > >> functions and basic relational operators. Building on top of Gandiva
> >> > >> we could save a lot of this effort.
> >> > >>
> >> > >> Julian
> >> > >>
> >> > >> [1]
> >> > >> https://lists.apache.org/thread.html/f099b3d1e2aaf9803c5c756f872a594baf17e9f25974e3496c9706d9@%3Cdev.arrow.apache.org%3E
> >> > >>
> >> > >> [2] https://issues.apache.org/jira/browse/CALCITE-2173
> >> > >
> >> >
> >>
> >>
> >> --
> >> 高橋 真之
> >>
>
