Hi everyone, I am Praveen, another engineer working on Gandiva. The interest and speed of engagement around this is great! Excited to engage with you folks on this.
Thx.

On 2018/06/22 18:09:42, Julian Hyde <j...@apache.org> wrote:
> This is exciting. We have wanted to build an Arrow adapter in Calcite for
> some time and have a prototype (see
> https://issues.apache.org/jira/browse/CALCITE-2173) but I hope that we can
> use Gandiva. I know that Gandiva has Java bindings, but will these allow
> queries to be compiled and executed from a pure Java process?
>
> Can you describe Gandiva’s governance model? Without an open governance
> model, companies that compete with Dremio may be wary about contributing.
>
> Can you compare and contrast your approach to Hyper[1]? Hyper is also
> concerned with efficient use to the bus, and also uses LLVM, but it has a
> different memory format and places much emphasis on lock-free data
> structures.
>
> I just attended SIGMOD and there were interesting industry papers from
> MemSQL[2][3] and Oracle RAPID[4]. I was impressed with some of the tricks
> MemSQL uses to achieve SIMD parallelism on queries such as “select k4, sum(x)
> from t group by k4” (where k4 has 4 values).
>
> I missed part of the RAPID talk, but I got the impression that they are using
> disk-based algorithms (e.g. hybrid hash join) to handle data spread between
> fast and slow memory.
>
> MemSQL uses TPC-H query 1 as a motivating benchmark and I think this would be
> good target for Gandiva also. It is a table scan with a range filter
> (returning 98% of rows), a low-cardinality aggregate (grouping by two fields
> with 3 values each), and several aggregate functions, the arguments of which
> contain common sub-expressions.
>
> SELECT
>   l_returnflag,
>   l_linestatus,
>   sum(l_quantity),
>   sum(l_extendedprice),
>   sum(l_extendedprice * (1 - l_discount)),
>   sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)),
>   avg(l_quantity),
>   avg(l_extendedprice),
>   avg(l_discount),
>   count(*)
> FROM lineitem
> WHERE l_shipdate <= date '1998-12-01' - interval '90' day
> GROUP BY
>   l_returnflag,
>   l_linestatus
> ORDER BY
>   l_returnflag,
>   l_linestatus;
>
> Julian
>
> [1] http://www.vldb.org/pvldb/vol4/p539-neumann.pdf
> [2] http://blog.memsql.com/how-careful-engineering-lead-to-processing-over-a-trillion-rows-per-second/
> [3] https://dl.acm.org/citation.cfm?id=3183713.3190658
> [4] https://dl.acm.org/citation.cfm?id=3183713.3190655
>
> On Jun 22, 2018, at 7:22 AM, ravind...@gmail.com wrote:
> >
> > Hi everyone,
> >
> > I'm Ravindra and I'm a developer on the Gandiva project. I do believe that
> > the combination of arrow and llvm for efficient expression evaluation is
> > powerful, and has a broad range of use-cases. We've just started and hope
> > to finesse and add a lot of functionality over the next few months.
> >
> > Welcome your feedback and participation in gandiva !!
> >
> > thanks & regards,
> > ravindra.
> >
> > On 2018/06/21 19:15:20, Jacques Nadeau <ja...@apache.org> wrote:
> >> Hey Guys,
> >>
> >> Dremio just open sourced a new framework for processing data in Arrow data
> >> structures [1], built on top of the Apache Arrow C++ APIs and leveraging
> >> LLVM (Apache licensed). It also includes Java APIs that leverage the Apache
> >> Arrow Java libraries. I expect the developers who have been working on this
> >> will introduce themselves soon. To read more about it, take a look at our
> >> Ravindra's blog post (he's the lead developer driving this work): [2].
> >> Hopefully people will find this interesting/useful.
> >>
> >> Let us know what you all think!
> >>
> >> thanks,
> >> Jacques
> >>
> >> [1] https://github.com/dremio/gandiva
> >> [2] https://www.dremio.com/announcing-gandiva-initiative-for-apache-arrow/