Thanks Wes. Arrow is a very exciting project.
I'm from Arm. We are interested in Arrow and would like to study it and help
improve it.

Yibo

On 11/1/19 1:25 AM, Wes McKinney wrote:
hi

On Thu, Oct 31, 2019 at 12:11 AM Yibo Cai <yibo....@arm.com> wrote:

Hi,

Arrow C++ integrates Gandiva to provide low-level operations on Arrow buffers
[1][2].
I have some questions; any help is appreciated:
- Arrow C++ already has compute kernels [3]. Do they duplicate what Gandiva
provides? I see a Jira issue discussing this [4].

No. There are some cases of functional overlap but we are servicing a
spectrum of use cases beyond the scope of Gandiva. Additionally, it is
unclear to me that an LLVM JIT compilation step should be required to
evaluate simple expressions such as "a > 5" -- in addition to
introducing latency (due to the compilation step) it is also a heavy
dependency to require the LLVM runtime in all applications.
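
To make the "a > 5" point concrete, here is a minimal sketch of what a pre-compiled kernel for that expression could look like. It is plain C++ with no LLVM involved. The function name GreaterThanScalar and the one-byte-per-result output are assumptions made for illustration; this is not the Arrow compute API, and it ignores nulls and Arrow's bit-packed boolean layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pre-compiled kernel (illustration only, not an Arrow API).
// Evaluates "value > threshold" for every element and stores one byte
// per result: 1 if the comparison holds, 0 otherwise.
std::vector<std::uint8_t> GreaterThanScalar(
    const std::vector<std::int64_t>& values, std::int64_t threshold) {
  std::vector<std::uint8_t> out(values.size());
  for (std::size_t i = 0; i < values.size(); ++i) {
    out[i] = values[i] > threshold ? 1 : 0;
  }
  return out;
}
```

A kernel like this is compiled once with the library, so evaluating the expression costs only the loop itself, with no JIT compilation step at run time.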

Personally I'm interested in supporting a wide gamut of analytics
workloads, from data frame / data science type libraries to SQL-like
systems. Gandiva is designed for the needs of a SQL-based execution
engine where chunks of data are fed into Projection or Filter nodes in
a computation graph -- Gandiva generates a specialized kernel to
perform a unit of work inside those nodes. Realistically, I expect
many real world applications will contain a mixture of pre-compiled
analytic kernels and JIT-compiled kernels.

Rome wasn't built in a day, so I'm expecting several years of work
ahead of us at the present rate. We need more help in this domain.

- Is Gandiva only for Arrow C++? What about other languages (Go, Rust, ...)?

It's being used in Java via JNI. The same approach could be applied
for the other languages as they have their own C FFI mechanisms.
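
As a rough sketch of the general idea, a C++ library can expose a function with C linkage, which is the kind of entry point that C FFI mechanisms in other languages can call. The function name example_filter_greater_than and its signature are hypothetical; this is not Gandiva's actual interface or its JNI bridge.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical C-ABI entry point (illustration only, not Gandiva's interface).
// extern "C" disables C++ name mangling so that foreign-function interfaces
// can look the symbol up by its plain name.
// Writes 1 or 0 into out_mask[i] for "values[i] > threshold" and returns
// the number of elements that passed the filter.
extern "C" std::size_t example_filter_greater_than(
    const std::int64_t* values, std::size_t length,
    std::int64_t threshold, std::uint8_t* out_mask) {
  std::size_t selected = 0;
  for (std::size_t i = 0; i < length; ++i) {
    const bool keep = values[i] > threshold;
    out_mask[i] = keep ? 1 : 0;
    if (keep) {
      ++selected;
    }
  }
  return selected;
}
```

Only raw pointers and fixed-width integers cross the boundary here, which keeps the signature expressible from most languages' FFI layers.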

- Gandiva leverages SIMD for vectorized operations [1], but I didn't see any
SIMD-related code. Am I missing something?

My understanding is that LLVM inserts many SIMD instructions
automatically based on the host CPU architecture version. Gandiva
developers may have some comments / pointers about this.
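
For illustration, a loop of the following shape contains no explicit SIMD intrinsics, yet it is the kind of code an optimizing compiler's auto-vectorizer can turn into SIMD instructions. The function name AddInt32 is hypothetical and this is not code from Gandiva; whether vector instructions are actually emitted depends on the compiler, the optimization flags, and the target CPU.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical element-wise kernel (illustration only): out[i] = a[i] + b[i].
// Each iteration is independent of the others, which is what makes the loop
// a candidate for auto-vectorization. The source contains no intrinsics.
void AddInt32(const std::int32_t* a, const std::int32_t* b,
              std::int32_t* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = a[i] + b[i];
  }
}
```

This may explain why a search for SIMD-specific code comes up empty: the vectorization, when it happens, is introduced by the compiler rather than written by hand.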


[1] https://www.dremio.com/announcing-gandiva-initiative-for-apache-arrow/
[2] https://github.com/apache/arrow/tree/master/cpp/src/gandiva
[3] https://github.com/apache/arrow/tree/master/cpp/src/arrow/compute
[4] https://issues.apache.org/jira/browse/ARROW-7017

Thanks,
Yibo
