On Fri, Nov 1, 2019 at 10:41 AM Yibo Cai <yibo....@arm.com> wrote:

> Thanks Wes. Arrow is a very exciting project.
> I'm from Arm. We are interested in arrow and would like to study and help
> improving arrow.
>

If you are familiar with LLVM/JIT, you could help us with improving the
optimisation passes in gandiva (tweaking existing ones or adding new ones or
any other tricks ..)

>
> Yibo
>
> On 11/1/19 1:25 AM, Wes McKinney wrote:
> > hi
> >
> > On Thu, Oct 31, 2019 at 12:11 AM Yibo Cai <yibo....@arm.com> wrote:
> >>
> >> Hi,
> >>
> >> Arrow cpp integrates Gandiva to provide low level operations on arrow buffers. [1][2]
> >> I have some questions, any help is appreciated:
> >> - Arrow cpp already has a compute kernel[3], does it duplicate what Gandiva provides? I see a Jira talk about it.[4]
> >
> > No. There are some cases of functional overlap but we are servicing a
> > spectrum of use cases beyond the scope of Gandiva. Additionally, it is
> > unclear to me that an LLVM JIT compilation step should be required to
> > evaluate simple expressions such as "a > 5" -- in addition to
> > introducing latency (due to the compilation step) it is also a heavy
> > dependency to require the LLVM runtime in all applications.
> >
> > Personally I'm interested in supporting a wide gamut of analytics
> > workloads, from data frame / data science type libraries to SQL-like
> > systems. Gandiva is designed for the needs of a SQL-based execution
> > engine where chunks of data are fed into Projection or Filter nodes in
> > a computation graph -- Gandiva generates a specialized kernel to
> > perform a unit of work inside those nodes. Realistically, I expect
> > many real world applications will contain a mixture of pre-compiled
> > analytic kernels and JIT-compiled kernels.
> >
> > Rome wasn't built in a day, so I'm expecting several years of work
> > ahead of us at the present rate. We need more help in this domain.
> >
> >> - Is Gandiva only for arrow cpp? What about other languages(go, rust, ...)?
> >
> > It's being used in Java via JNI. The same approach could be applied
> > for the other languages as they have their own C FFI mechanisms.
> >
> >> - Gandiva leverages SIMD for vectorized operations[1], but I didn't see any related code. Am I missing something?
> >
> > My understanding is that LLVM inserts many SIMD instructions
> > automatically based on the host CPU architecture version. Gandiva
> > developers may have some comments / pointers about this
> >
> >>
> >> [1] https://www.dremio.com/announcing-gandiva-initiative-for-apache-arrow/
> >> [2] https://github.com/apache/arrow/tree/master/cpp/src/gandiva
> >> [3] https://github.com/apache/arrow/tree/master/cpp/src/arrow/compute
> >> [4] https://issues.apache.org/jira/browse/ARROW-7017
> >>
> >> Thanks,
> >> Yibo
>

--
Thanks and regards,
Ravindra.
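
For reference, the two points in the thread (a simple expression such as "a > 5" not needing a JIT step, and SIMD instructions being inserted by the compiler rather than written by hand) can be illustrated with a minimal sketch. This is not Arrow or Gandiva code: the function name GreaterThan and the use of std::vector are assumptions made only for this example. The loop is branch-free with no dependency between iterations, which is the shape an optimizing compiler such as Clang/LLVM can typically auto-vectorize when optimizations are enabled; whether it actually does so depends on the compiler version, flags and target CPU.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pre-compiled kernel (illustration only, not an Arrow/Gandiva API).
// Writes 1 to out[i] where values[i] > threshold, and 0 otherwise.
std::vector<std::uint8_t> GreaterThan(const std::vector<std::int32_t>& values,
                                      std::int32_t threshold) {
  std::vector<std::uint8_t> out(values.size());
  const std::int32_t* in = values.data();
  std::uint8_t* dst = out.data();
  const std::size_t n = values.size();
  // No branches and no cross-iteration dependency: a candidate for
  // compiler auto-vectorization. No SIMD intrinsics appear in the source.
  for (std::size_t i = 0; i < n; ++i) {
    dst[i] = static_cast<std::uint8_t>(in[i] > threshold);
  }
  return out;
}
```

Calling GreaterThan(a, 5) on a column of 32-bit integers returns one byte per input value. A JIT-based engine would instead generate an equivalent loop at run time for the specific expression it was given.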