Hi Julian,

> Masayuki Takahashi has started to develop an Arrow adapter for Calcite[2],
> but a lot of work remains to implement all SQL built-in functions and basic
> relational operators. Building on top of Gandiva we could save a lot of this
> effort.

I will start building a Gandiva development environment and consider how to
incorporate it.

Thanks.

On Sat, Jun 23, 2018 at 3:54, Julian Hyde <[email protected]> wrote:
>
> Suppose a company wishes to build a graph database using their own innovative
> graph index data structure. They nevertheless need to implement core
> relational algebra, core data types, and core built-in functions (+, CASE,
> SUM, SUBSTRING). And they want to implement these on a memory-efficient data
> structure (tens of thousands of rows, stored column-oriented, per memory
> block). This is a massive effort.
>
> With Calcite+Gandiva+Arrow they just need to create a sequence of relational
> operators (using RelBuilder, say) and efficient machine code is generated.
> They can then start adding their own data types, built-in functions, and
> relational operators, using the same architecture.
>
> Julian
>
> > On Jun 22, 2018, at 11:33 AM, Xiening Dai <[email protected]> wrote:
> >
> > I was in a talk regarding Gandiva yesterday. Impressive work!
> >
> > But I am not sure why Calcite would like to integrate with it. To me
> > Gandiva is on the execution side; in which scenarios would a query planner
> > need an Arrow engine? I read the original Jira about implementing a file
> > enumerator, but the intent is still not clear to me. I would appreciate it
> > if you could elaborate. Thanks.
> >
> >> On Jun 22, 2018, at 11:20 AM, Julian Hyde <[email protected]> wrote:
> >>
> >> There is a discussion on dev@arrow about Gandiva, a kernel for Arrow[1].
> >>
> >> I think it would be an interesting library on which to build our Arrow
> >> engine. (Without a kernel, Arrow is just a data format, but with Gandiva
> >> it becomes an engine upon which we can implement all relational
> >> operations, albeit on a multi-threaded single node. Potentially this
> >> approach can process each row in a few machine cycles, i.e. billions of
> >> records per second. Therefore single-node would be sufficient for many
> >> queries.)
> >>
> >> Masayuki Takahashi has started to develop an Arrow adapter for Calcite[2],
> >> but a lot of work remains to implement all SQL built-in functions and
> >> basic relational operators. Building on top of Gandiva we could save a lot
> >> of this effort.
> >>
> >> Julian
> >>
> >> [1] https://lists.apache.org/thread.html/f099b3d1e2aaf9803c5c756f872a594baf17e9f25974e3496c9706d9@%3Cdev.arrow.apache.org%3E
> >>
> >> [2] https://issues.apache.org/jira/browse/CALCITE-2173

--
高橋 真之
