Re: Arrow Rust roadmapping [was Re: [Gandiva] Representing logical query plans in protobuf]

2019-01-06 Thread Uwe L. Korn
Hello Andy, one thing that came up in past discussions, and that also opened me up a bit to the parquet-cpp merge, is that merging code into a repo doesn't mean that it will always reside there. Apache has the infrastructure and guidelines to split a part of a project into a separate one. This i

Re: Arrow Rust roadmapping [was Re: [Gandiva] Representing logical query plans in protobuf]

2019-01-05 Thread Andy Grove
Hi Wes, Yes, I have a SQL parser (actually this is a separate crate) and DataFusion has the query planner and execution engine. Here is a blog post from last summer with some performance comparisons with Apache Spark: https://andygrove.io/2018/05/datafusion-aggregate-performance/ I have recently

Re: Arrow Rust roadmapping [was Re: [Gandiva] Representing logical query plans in protobuf]

2019-01-05 Thread Wes McKinney
hi Andy, On Sat, Jan 5, 2019 at 3:59 PM Andy Grove wrote: > > Thanks Neville for starting this discussion. > > The next set of things I am interested now that we have some primitive > operators in place is performing aggregate queries over a sequence of > RecordBatches (in fact I just got that wo

Re: Arrow Rust roadmapping [was Re: [Gandiva] Representing logical query plans in protobuf]

2019-01-05 Thread Andy Grove
Thanks Neville for starting this discussion. The next set of things I am interested in, now that we have some primitive operators in place, is performing aggregate queries over a sequence of RecordBatches (in fact I just got that working in DataFusion this morning) and then moving onto other SQL featur

Re: Arrow Rust roadmapping [was Re: [Gandiva] Representing logical query plans in protobuf]

2019-01-05 Thread Neville Dipale
Hi Wes, I'm aware of your remarks about the amount of work that leadership on OSS projects takes, and as for the time aspect, one only has to look at another contributor's local timezone to see the hours and days they work. To be proactive, I'll hash together a rough roadmap for Rust, and sh

Arrow Rust roadmapping [was Re: [Gandiva] Representing logical query plans in protobuf]

2019-01-05 Thread Wes McKinney
hi Neville, On Sat, Jan 5, 2019 at 2:37 PM Neville Dipale wrote: > > Hi Andy & Wes, > > Apologies if I go off-topic a bit, I hope my thoughts are related though. > > I'm a new contributor to Arrow, but I've been using and following it since > the feather days. I'm interested in contributing to Ru

Re: [Gandiva] Representing logical query plans in protobuf

2019-01-05 Thread Neville Dipale
Hi Andy & Wes, Apologies if I go off-topic a bit, I hope my thoughts are related though. I'm a new contributor to Arrow, but I've been using and following it since the feather days. I'm interested in contributing to Rust, as that aligns more with my day job(s). I think we (rather, the Rust contr

Re: [Gandiva] Representing logical query plans in protobuf

2019-01-05 Thread Andy Grove
Wes, That makes sense. I'll create a fresh PR to add a new protobuf under the Rust module for now (even though this won't be Rust specific). Thanks, Andy. On Sat, Jan 5, 2019 at 9:19 AM Wes McKinney wrote: > hey Andy, > > I replied on GitHub and then saw your e-mail thread. > > The Gandiva

Re: [Gandiva] Representing logical query plans in protobuf

2019-01-05 Thread Wes McKinney
hey Andy, I replied on GitHub and then saw your e-mail thread. The Gandiva library as it stands right now is not a query engine or an execution engine, properly speaking. It is a subgraph compiler for creating accelerated expressions for use inside another execution or query engine, like it is be

[Gandiva] Representing logical query plans in protobuf

2019-01-05 Thread Andy Grove
I have created a PR to start a discussion around representing logical query plans in Gandiva (ARROW-4163). https://github.com/apache/arrow/pull/3319 I think that adding the various steps such as projection, selection, sort, and so on is fairly simple and not contentious. The harder part is how w
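
For readers unfamiliar with the idea, the plan steps named in the message above (projection, selection, sort) can be pictured as a small tree of nodes, each wrapping its input. The sketch below is only an illustration under stated assumptions: it uses plain Rust enums rather than protobuf, and every name in it (`Expr`, `LogicalPlan`, `Scan`, `depth`, `example_plan`) is hypothetical and not taken from the PR or from DataFusion.

```rust
// Hypothetical types for illustration only; this is NOT the schema proposed
// in the PR, and it uses Rust enums instead of protobuf messages.

/// A minimal expression: a column reference, an integer literal,
/// or a greater-than comparison between two expressions.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    LiteralInt(i64),
    GreaterThan(Box<Expr>, Box<Expr>),
}

/// A logical plan is a tree: every step except `Scan` wraps one input plan.
#[derive(Debug, Clone, PartialEq)]
pub enum LogicalPlan {
    /// Read all rows of a named table (assumed leaf node).
    Scan { table: String },
    /// Keep only the rows for which `predicate` holds.
    Selection { predicate: Expr, input: Box<LogicalPlan> },
    /// Keep only the listed columns or expressions.
    Projection { columns: Vec<Expr>, input: Box<LogicalPlan> },
    /// Order rows by the given keys.
    Sort { keys: Vec<Expr>, input: Box<LogicalPlan> },
}

impl LogicalPlan {
    /// Number of steps in the plan, counting this node and all of its inputs.
    pub fn depth(&self) -> usize {
        match self {
            LogicalPlan::Scan { .. } => 1,
            LogicalPlan::Selection { input, .. }
            | LogicalPlan::Projection { input, .. }
            | LogicalPlan::Sort { input, .. } => 1 + input.depth(),
        }
    }
}

/// Builds the plan for something like `SELECT name FROM people WHERE age > 21`.
pub fn example_plan() -> LogicalPlan {
    let scan = LogicalPlan::Scan { table: "people".to_string() };
    let filtered = LogicalPlan::Selection {
        predicate: Expr::GreaterThan(
            Box::new(Expr::Column("age".to_string())),
            Box::new(Expr::LiteralInt(21)),
        ),
        input: Box::new(scan),
    };
    LogicalPlan::Projection {
        columns: vec![Expr::Column("name".to_string())],
        input: Box::new(filtered),
    }
}
```

Calling `example_plan()` yields a `Projection` over a `Selection` over a `Scan`. A serialized form such as protobuf would need to encode the same nesting, which is one way to read why the individual steps are described as simple.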