Hi there, I have created a little JMH test to check Arrow performance. You can find it here. The idea is to test an API with implementations on heap arrays, NIO buffers (laid out following the Arrow format), and Arrow itself. At the moment the API only supports nullable int buffers and contains read-only methods.
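Roughly speaking, the API looks something like the sketch below. The interface and class names here are my own illustration of the idea, not the actual code from the linked repository, and the Arrow-backed variant simply delegates to IntVector:

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

import org.apache.arrow.vector.IntVector;

// Hypothetical read-only API for a nullable int column (names are illustrative).
interface IntColumn {
    boolean isNull(int index);
    int get(int index);          // undefined if isNull(index) is true
    int size();
}

// Heap implementation backed by plain Java arrays.
final class HeapIntColumn implements IntColumn {
    private final boolean[] nulls;
    private final int[] values;

    HeapIntColumn(boolean[] nulls, int[] values) {
        this.nulls = nulls;
        this.values = values;
    }

    public boolean isNull(int index) { return nulls[index]; }
    public int get(int index)        { return values[index]; }
    public int size()                { return values.length; }
}

// Offheap implementation reading directly from NIO buffers laid out like the
// Arrow format: a validity bitmap plus a 4-byte little-endian data buffer.
final class BufferIntColumn implements IntColumn {
    private final ByteBuffer validity; // 1 bit per value
    private final ByteBuffer data;     // 4 bytes per value
    private final int size;

    BufferIntColumn(ByteBuffer validity, ByteBuffer data, int size) {
        this.validity = validity;
        this.data = data.order(ByteOrder.LITTLE_ENDIAN);
        this.size = size;
    }

    public boolean isNull(int index) {
        int bits = validity.get(index >>> 3) & 0xFF;
        return (bits & (1 << (index & 7))) == 0; // Arrow: bit set means "valid"
    }

    public int get(int index) { return data.getInt(index << 2); }
    public int size()         { return size; }
}

// Implementation delegating to an Arrow IntVector.
final class ArrowIntColumn implements IntColumn {
    private final IntVector vector;

    ArrowIntColumn(IntVector vector) { this.vector = vector; }

    public boolean isNull(int index) { return vector.isNull(index); }
    public int get(int index)        { return vector.get(index); }
    public int size()                { return vector.getValueCount(); }
}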
The benchmark runs on automatically generated vectors of 2^10, 2^20 and 2^26 never-null integers and tests three different access patterns (a rough JMH sketch of these patterns appears at the end of this post):

- Random access: a random element is read.
- Sequential access: a random index is chosen and then the following 32 elements are read.
- Sum access: similar to sequential, but instead of simply reading the elements, they are added into a long.

Disclaimer: microbenchmarks are error prone, I'm not an expert on JMH, and this benchmark was put together in a couple of hours.

Results

On all charts the Y axis is the ratio between the throughput of the offheap versions and that of the heap version (so higher is better).

TL;DR: It seems that the complex structures of Arrow are preventing some optimizations in the JVM.

Random

Random access is quite good. The heap version is a little bit better, but both offheap solutions look pretty similar.

        1K      1M      64M
Array   75.139  53.025  10.872
Arrow   67.399  43.491  10.42
Buf     82.877  38.092  10.753

[image: embedded chart 1]

Sequential

Looking at the absolute values, it is clear that JMH's Blackhole is preventing any JVM optimization of the loop. I think that's fine, as it simulates several calls to the vector in a *not optimized* scenario. It seems that the JVM is not smart enough to optimize offheap sequential access as much as it does with heap structures. Although both offheap implementations are worse than the heap version, the one that uses Arrow is noticeably worse than the one that directly uses ByteBuffers:

        1K      1M      64M
Array   6.335   4.563   3.145
Arrow   2.664   2.453   1.989
Buf     4.456   3.971   3.018

[image: embedded chart 2]

Sum

The result is awful. It seems that the JVM is able to optimize (I guess by vectorizing) the heap and ByteBuffer implementations (at least with small vectors), but not the Arrow version. I guess this is due to the extra indirections and the deeper call stack required to execute the same code on Arrow.

        1K      1M      64M
Array   44.833  26.617  9.787
Arrow   3.426   3.265   2.521
Buf     38.288  19.295  5.668

[image: embedded chart 4]
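For reference, here is roughly how the sequential and sum patterns could be written in JMH, reusing the IntColumn types sketched above. The class and field names are my own, and the real benchmark linked at the top may differ in details (how the random index is drawn, how the offheap columns are built, etc.):

import java.util.concurrent.ThreadLocalRandom;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Benchmark)
public class AccessPatternBenchmark {

    @Param({"1024", "1048576", "67108864"}) // 2^10, 2^20, 2^26
    int size;

    IntColumn column; // one of the heap / buffer / Arrow implementations

    @Setup
    public void setup() {
        // Fill a heap column with never-null random ints; the real benchmark
        // builds the ByteBuffer and Arrow variants with the same contents.
        int[] values = new int[size];
        boolean[] nulls = new boolean[size];
        for (int i = 0; i < size; i++) {
            values[i] = ThreadLocalRandom.current().nextInt();
        }
        column = new HeapIntColumn(nulls, values);
    }

    @Benchmark
    public void sequential(Blackhole bh) {
        // "Sequential" pattern: read the 32 values after a random index and
        // consume each one through the Blackhole, which keeps the JIT from
        // collapsing the loop.
        int start = ThreadLocalRandom.current().nextInt(size - 32);
        for (int i = start; i < start + 32; i++) {
            bh.consume(column.get(i));
        }
    }

    @Benchmark
    public long sum() {
        // "Sum" pattern: add the 32 values after a random index into a long
        // and return it, giving the JIT a chance to vectorize the loop.
        int start = ThreadLocalRandom.current().nextInt(size - 32);
        long sum = 0;
        for (int i = start; i < start + 32; i++) {
            sum += column.get(i);
        }
        return sum;
    }
}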