Hi Eric,

Antoine recently did some work on faster bit-setting in C++ by unrolling the main loop to set one byte at a time:

https://github.com/apache/arrow/blob/27b869ae5df31f3be61e76e99999d96ea7d9b557/cpp/src/arrow/util/bit-util.h#L598

This yielded major speedups when setting a lot of bits. A similar strategy should be possible in Java for your use case. We speculated that it could be made even faster by eliminating the branch in the bit-setting assignments (the g() ? left_branch : right_branch statements).

If you dig around in the Dremio codebase you can find plenty of low-level off-heap memory manipulation that may be helpful (others may be able to comment). If some utilities could be developed here in the Arrow Java codebase for common benefit, that would be great.

Otherwise, copying the values data without branching is an obvious optimization. Others may have ideas.

- Wes

On Mon, Jul 23, 2018 at 5:50 PM, Eric Wohlstadter <wohls...@gmail.com> wrote:
> Hi all,
> I work on a project that uses the Arrow streaming format to transfer data
> between Java processes.
> We're also following the progress on Java support for Plasma, and may
> decide to use Plasma also.
>
> We typically use a pattern like this to fill Arrow vectors from Java
> arrays:
> ----
> int[] inputValues = ...;
> boolean[] nullInputValues = ...;
>
> org.apache.arrow.vector.IntVector vector = ...;
> for(int i = 0; i < inputValues.length; i++) {
>   if(nullInputValues[i]) {
>     vector.setNull(i);
>   } else {
>     vector.set(i, inputValues[i]);
>   }
> }
> ----
>
> Obviously the JIT won't be able to vectorize this loop. Does anyone know if
> there is another way to achieve this which
> would be vectorized?
>
> Here is a pseudo-code mockup of what I was thinking about. Is this approach
> worth pursuing?
>
> The idea is to try to convert input into Arrow format in a vectorized loop,
> and then use sun.misc.Unsafe to copy the
> converted on-heap input to an off-heap valueBuffer.
>
> I'll ignore the details of the validityBuffer here, since it would follow
> along the same lines:
>
> ----
> int[] inputValues = ...;
> org.apache.arrow.vector.IntVector vector = ...;
>
> for(int i = 0; i < inputValues.length; i++) {
>   //convert inputValues[i] to little-endian
>   //this conversion can be SIMD vectorized?
> }
> UNSAFE.copyMemory(
>   inputValues,
>   Unsafe.ARRAY_INT_BASE_OFFSET,
>   null,
>   vector.getDataBuffer().memoryAddress(),
>   (long) Integer.BYTES * inputValues.length
> );
> ----
>
> Thanks for any feedback about details I may be misunderstanding, which
> would make this approach infeasible.
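
For illustration, the byte-at-a-time strategy mentioned in the reply could look roughly like the following in Java. This is only a sketch, not the C++ implementation linked above and not an existing Arrow Java utility: the class name ValidityBitmapSketch and the method packValidity are hypothetical. It assumes the Arrow validity convention (bit set to 1 means the slot is valid, least-significant bit first) and builds the bitmap in an on-heap byte[]; copying it into the vector's validity buffer is left out. Java has no boolean-to-int cast, so the sketch uses a ternary, and whether the JIT compiles that to branch-free code is not guaranteed.

```java
class ValidityBitmapSketch {

    /**
     * Packs null flags into a validity bitmap, one output byte per
     * group of eight input flags (bit = 1 means "valid", LSB first).
     * Hypothetical helper, not part of the Arrow Java API.
     */
    static byte[] packValidity(boolean[] isNull) {
        int n = isNull.length;
        byte[] bitmap = new byte[(n + 7) / 8];
        int fullBytes = n / 8;

        // Main loop: assemble eight bits in a local int, then store the
        // byte once instead of doing eight read-modify-write updates.
        for (int b = 0; b < fullBytes; b++) {
            int i = b * 8;
            int bits = (isNull[i]     ? 0 : 1)
                     | (isNull[i + 1] ? 0 : 1) << 1
                     | (isNull[i + 2] ? 0 : 1) << 2
                     | (isNull[i + 3] ? 0 : 1) << 3
                     | (isNull[i + 4] ? 0 : 1) << 4
                     | (isNull[i + 5] ? 0 : 1) << 5
                     | (isNull[i + 6] ? 0 : 1) << 6
                     | (isNull[i + 7] ? 0 : 1) << 7;
            bitmap[b] = (byte) bits;
        }

        // Tail: fewer than eight flags remain, so set them bit by bit.
        for (int i = fullBytes * 8; i < n; i++) {
            bitmap[i >> 3] |= (isNull[i] ? 0 : 1) << (i & 7);
        }
        return bitmap;
    }
}
```

With the validity bits handled separately like this, the values themselves can be copied in bulk without a per-element null check, which is the "copying the values data without branching" point from the reply.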