Suppose a company wishes to build a graph database using their own innovative 
graph index data structure. They nevertheless need to implement core relational 
algebra, core data types, and core built-in functions (+, CASE, SUM, 
SUBSTRING). And they want to implement these on a memory-efficient data 
structure (tens of thousands of rows, stored column-oriented, per memory 
block). This is a massive effort.
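
To make "stored column-oriented, per memory block" concrete, here is a minimal sketch in plain Python. It is not Arrow or Gandiva code: the ColumnBlock class and the add, case_when and total helpers are hypothetical names invented for illustration, and a real block would hold tens of thousands of rows rather than four.

```python
from array import array


class ColumnBlock:
    """One memory block: every column is a contiguous typed array."""

    def __init__(self, **columns):
        lengths = {len(col) for col in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns in a block must have the same length")
        self.columns = columns
        self.row_count = lengths.pop() if lengths else 0


def add(left, right):
    """Column-at-a-time '+': one pass over two int64 columns."""
    return array("q", (a + b for a, b in zip(left, right)))


def case_when(condition, then_col, else_col):
    """Column-at-a-time CASE WHEN condition THEN then_col ELSE else_col END."""
    return array(
        "q", (t if c else e for c, t, e in zip(condition, then_col, else_col))
    )


def total(column):
    """SUM aggregate over a single column."""
    return sum(column)


# Hypothetical four-row block with two int64 columns.
block = ColumnBlock(
    price=array("q", [10, 20, 30, 40]),
    tax=array("q", [1, 2, 3, 4]),
)

gross = add(block.columns["price"], block.columns["tax"])
is_large = array("b", (1 if g > 25 else 0 for g in gross))
cap = array("q", [25] * block.row_count)
capped = case_when(is_large, cap, gross)  # CASE WHEN gross > 25 THEN 25 ELSE gross END
result = total(capped)
```

Each function walks whole columns rather than whole rows. That access pattern is what a kernel would compile down to tight loops, and it is the part every new engine would otherwise have to write for itself.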

With Calcite+Gandiva+Arrow they just need to create a sequence of relational 
operators (using RelBuilder, say) and efficient machine code is generated. They 
can then start adding their own data types, built-in functions, and relational 
operators, using the same architecture.
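
As a rough picture of "create a sequence of relational operators", the toy below chains scan, filter and project through a builder and hands the finished plan to an executor. It is a sketch under loud assumptions: PlanBuilder and execute are hypothetical names, this is not Calcite's RelBuilder API, and the executor interprets the plan row by row in Python where Gandiva would generate machine code over Arrow blocks.

```python
class PlanBuilder:
    """Hypothetical toy builder; NOT Calcite's RelBuilder API."""

    def __init__(self):
        self._ops = []

    def scan(self, table_name):
        self._ops.append(("scan", table_name))
        return self

    def filter(self, predicate):
        self._ops.append(("filter", predicate))
        return self

    def project(self, **expressions):
        self._ops.append(("project", expressions))
        return self

    def build(self):
        # The plan is just the ordered list of operators.
        return list(self._ops)


def execute(plan, tables):
    """Interpret the plan; a real engine would compile it instead."""
    rows = []
    for op, arg in plan:
        if op == "scan":
            rows = list(tables[arg])
        elif op == "filter":
            rows = [row for row in rows if arg(row)]
        elif op == "project":
            rows = [{name: fn(row) for name, fn in arg.items()} for row in rows]
    return rows


# Hypothetical sample data.
tables = {
    "EMP": [
        {"name": "Ann", "sal": 100, "deptno": 10},
        {"name": "Bob", "sal": 200, "deptno": 20},
        {"name": "Cy", "sal": 300, "deptno": 10},
    ]
}

plan = (
    PlanBuilder()
    .scan("EMP")
    .filter(lambda row: row["deptno"] == 10)
    .project(name=lambda row: row["name"], raised=lambda row: row["sal"] + 50)
    .build()
)
result = execute(plan, tables)
```

The point of the split is that the plan is plain data. The same operator sequence could be handed to a different back end without changing the code that built it.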

Julian


> On Jun 22, 2018, at 11:33 AM, Xiening Dai <[email protected]> wrote:
> 
> I was in a talk regarding Gandiva yesterday. Impressive work!
> 
> But I am not sure why Calcite would want to integrate with it. To me, 
> Gandiva is on the execution side. In which scenarios would a query planner 
> need an Arrow engine? I read the original Jira about implementing a file 
> enumerator, but the intent is still not clear to me. I would appreciate it 
> if you could elaborate. Thanks.
> 
> 
>> On Jun 22, 2018, at 11:20 AM, Julian Hyde <[email protected]> wrote:
>> 
>> There is a discussion on dev@arrow about Gandiva, a kernel for Arrow[1].
>> 
>> I think it would be an interesting library on which to build our Arrow 
>> engine. (Without a kernel, Arrow is just a data format, but with Gandiva it 
>> becomes an engine upon which we can implement all relational operations, 
>> albeit on a multi-threaded single node. Potentially this approach can 
>> process each row in a few machine cycles, i.e. billions of records per 
>> second. Therefore single-node would be sufficient for many queries.)
>> 
>> Masayuki Takahashi has started to develop an Arrow adapter for Calcite[2], 
>> but a lot of work remains to implement all SQL built-in functions and basic 
>> relational operators. Building on top of Gandiva we could save a lot of this 
>> effort.
>> 
>> Julian
>> 
>> [1] https://lists.apache.org/thread.html/f099b3d1e2aaf9803c5c756f872a594baf17e9f25974e3496c9706d9@%3Cdev.arrow.apache.org%3E
>> 
>> [2] https://issues.apache.org/jira/browse/CALCITE-2173
> 
