Hi Eric,

Antoine recently did some work on faster bit-setting in C++ by
unrolling the main loop to set one byte at a time:

https://github.com/apache/arrow/blob/27b869ae5df31f3be61e76e99999d96ea7d9b557/cpp/src/arrow/util/bit-util.h#L598

This yielded major speedups when setting a lot of bits. A similar
strategy should be possible in Java for your use case. We speculated
that it could be made even faster by eliminating the branch in the
bit-setting assignments (the g() ? left_branch : right_branch
statements). If you dig around in the Dremio codebase you can find
plenty of low-level off-heap memory manipulation that may be helpful
(others may be able to comment).
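
As a rough illustration of the byte-at-a-time idea in Java, here is a
minimal sketch that packs a boolean[] of null flags into an Arrow-style
validity bitmap (least-significant bit first, 1 = valid). The class and
method names are hypothetical and not part of the Arrow Java API, and a
plain byte[] stands in for the off-heap validity buffer. The ternary in
bit() is something the JIT can often compile to a conditional move, but
that is an assumption, not a guarantee.

```java
final class ValidityPacker {
    // Hypothetical helper (not an Arrow API): 1 if slot i is non-null, else 0.
    private static int bit(boolean[] isNull, int i) {
        return isNull[i] ? 0 : 1;
    }

    /** Packs null flags into a validity bitmap: LSB first, 1 = valid. */
    static byte[] packValidity(boolean[] isNull) {
        int n = isNull.length;
        byte[] bitmap = new byte[(n + 7) / 8];
        int fullBytes = n / 8;

        // Main loop: assemble eight bits in a register, then do one byte store.
        for (int b = 0; b < fullBytes; b++) {
            int i = b * 8;
            bitmap[b] = (byte) (bit(isNull, i)
                    | bit(isNull, i + 1) << 1
                    | bit(isNull, i + 2) << 2
                    | bit(isNull, i + 3) << 3
                    | bit(isNull, i + 4) << 4
                    | bit(isNull, i + 5) << 5
                    | bit(isNull, i + 6) << 6
                    | bit(isNull, i + 7) << 7);
        }

        // Tail: fewer than eight remaining slots, set bit by bit.
        for (int i = fullBytes * 8; i < n; i++) {
            bitmap[i >> 3] |= (byte) (bit(isNull, i) << (i & 7));
        }
        return bitmap;
    }
}
```

The resulting bytes could then be copied into the vector's validity
buffer in one bulk operation instead of calling setNull() per element.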

If some utilities could be developed here in the Arrow Java codebase
for common benefit, that would be great.

Otherwise, copying the values data without branching is an obvious
optimization. Others may have ideas.
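
A minimal sketch of what branch-free copying could look like, assuming
the values of null slots may hold arbitrary data (the validity bitmap
decides whether they are read). The direct ByteBuffer below is only a
stand-in for the vector's off-heap data buffer, and the class and
method names are hypothetical:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class BulkValueCopy {
    /** Copies every value in one bulk put, ignoring null flags entirely. */
    static ByteBuffer copyValues(int[] values) {
        // Off-heap, little-endian storage, 4 bytes per int.
        ByteBuffer data = ByteBuffer
                .allocateDirect(values.length * Integer.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        // The int view inherits the byte order set above; the bulk put
        // has no per-element null check, so there is no branch in the copy.
        data.asIntBuffer().put(values);
        return data;
    }
}
```

Nulls are then handled only in the validity bitmap, so the per-element
if/else in the original loop disappears from the values path.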

- Wes

On Mon, Jul 23, 2018 at 5:50 PM, Eric Wohlstadter <wohls...@gmail.com> wrote:
> Hi all,
>   I work on a project that uses Arrow streaming format to transfer data
> between Java processes.
> We're also following the progress on Java support for Plasma, and may
> decide to use Plasma also.
>
> We typically use a pattern like this to fill Arrow vectors from Java
> arrays:
> ----
> int[] inputValues = ...;
> boolean[] nullInputValues = ...;
>
> org.apache.arrow.vector.IntVector vector = ...;
> for (int i = 0; i < inputValues.length; i++) {
>   if(nullInputValues[i]) {
>     vector.setNull(i);
>   } else {
>     vector.set(i, inputValues[i]);
>   }
> }
> ----
>
> Obviously the JIT won't be able to vectorize this loop. Does anyone know if
> there is another way to achieve this which
> would be vectorized?
>
> Here is a pseudo-code mockup of what I was thinking about. Is this approach
> worth pursuing?
>
> The idea is to try to convert input into Arrow format in a vectorized loop,
> and then use sun.misc.Unsafe to copy the
> converted on-heap input to an off-heap valueBuffer.
>
> I'll ignore the details of the validityBuffer here, since it would follow
> along the same lines:
>
> ----
> int[] inputValues = ...;
> org.apache.arrow.vector.IntVector vector = ...;
>
> for (int i = 0; i < inputValues.length; i++) {
>   //convert inputValues[i] to little-endian
>   //this conversion can be SIMD vectorized?
> }
> UNSAFE.copyMemory(
>   inputValues,
>   Unsafe.ARRAY_INT_BASE_OFFSET,  // int[] data starts after the array header
>   null,
>   vector.getDataBuffer().memoryAddress(),
>   (long) Integer.BYTES * inputValues.length
> );
> ----
>
> Thanks for any feedback about details I may be misunderstanding, which
> would make this approach infeasible.
