I forgot to say that the tests were executed on my Ubuntu 17.04 laptop on Oracle JDK 1.8.0_144-b01.

2017-09-15 13:21 GMT+02:00 Gonzalo Ortiz Jaureguizar <golthir...@gmail.com>:

> Hi there,
>
> I have created a little JMH test to check the Arrow performance. You can
> find it here. The idea is to test an API with implementations on heap
> arrays, NIO buffers (that follow the Arrow format) and Arrow. At the
> moment the API only supports nullable int buffers and contains read-only
> methods.
>
> The benchmark runs on automatically generated vectors of 2^10, 2^20 and
> 2^26 never-null integers and it tests three different access patterns:
>
> - Random access: a random element is read
> - Sequential access: a random index is chosen and then the
>   following 32 elements are read
> - Sum access: similar to sequential, but instead of simply reading them,
>   they are added into a long.
>
> Disclaimer: Microbenchmarks are error prone, I'm not an expert on JMH,
> and this benchmark was done in a couple of hours.
>
> Results
> On all charts the Y axis is the ratio between the throughput of the
> offheap versions and the heap version (so the higher the better).
>
> TL;DR: It seems that the complex structures of Arrow are preventing some
> optimizations on the JVM.
>
> Random
> The random access is quite good. The heap version is a little bit better,
> but both offheap solutions seem pretty similar.
>
>          1K       1M       64M
> Array    75.139   53.025   10.872
> Arrow    67.399   43.491   10.42
> Buf      82.877   38.092   10.753
> [image: Inline image 1]
>
> Sequential
> If you look at the absolute values, it is clear that JMH's blackhole is
> preventing any JVM optimization on the loop. I think that's fine, as it
> simulates several calls to the vector in a *not optimized* scenario.
> It seems that the JVM is not smart enough to optimize offheap sequential
> access as much as it does with heap structures. Although both offheap
> implementations are worse than the heap version, the one that uses Arrow is
> noticeably worse than the one that directly uses ByteBuffers:
>
>          1K       1M       64M
> Array    6.335    4.563    3.145
> Arrow    2.664    2.453    1.989
> Buf      4.456    3.971    3.018
> [image: Inline image 2]
>
> Sum
> The result is awful. It seems that the JVM is able to optimize (I guess
> by vectorizing) the heap and ByteBuffer implementations (at least with
> small vectors), but not the Arrow version. I guess it is due to
> the indirections and deeper stack required to execute the same code on
> Arrow.
>
>          1K       1M       64M
> Array    44.833   26.617   9.787
> Arrow    3.426    3.265    2.521
> Buf      38.288   19.295   5.668
> [image: Inline image 4]
>
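
For readers without the benchmark source at hand, the three access patterns described in the quoted message can be sketched roughly as below. This is not the benchmark code. `IntVector`, `HeapVector`, `BufferVector` and the method names are hypothetical, the Arrow-backed implementation and the validity (null) bitmap are left out, and an `IntConsumer` stands in for the JMH blackhole so the sketch runs without JMH.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Random;
import java.util.function.IntConsumer;

public class AccessPatterns {

    /** Hypothetical read-only vector API (an assumption, not the benchmark's real interface). */
    public interface IntVector {
        int size();
        int get(int index);
    }

    /** On-heap implementation backed by a plain int[]. */
    public static final class HeapVector implements IntVector {
        private final int[] values;
        public HeapVector(int[] values) { this.values = values; }
        @Override public int size() { return values.length; }
        @Override public int get(int index) { return values[index]; }
    }

    /** Off-heap implementation: a direct ByteBuffer holding little-endian 32-bit ints. */
    public static final class BufferVector implements IntVector {
        private final ByteBuffer buffer;
        private final int size;
        public BufferVector(int[] values) {
            this.size = values.length;
            this.buffer = ByteBuffer.allocateDirect(size * Integer.BYTES)
                                    .order(ByteOrder.LITTLE_ENDIAN);
            for (int i = 0; i < size; i++) {
                buffer.putInt(i * Integer.BYTES, values[i]); // absolute put, position untouched
            }
        }
        @Override public int size() { return size; }
        @Override public int get(int index) { return buffer.getInt(index * Integer.BYTES); }
    }

    /** Number of consecutive elements read by the sequential and sum patterns. */
    public static final int RUN = 32;

    /** Picks a start index such that RUN elements fit; assumes v.size() >= RUN. */
    public static int randomStart(IntVector v, Random rnd) {
        return rnd.nextInt(v.size() - RUN + 1);
    }

    /** Random access: read a single element at a random index. */
    public static int randomAccess(IntVector v, Random rnd) {
        return v.get(rnd.nextInt(v.size()));
    }

    /** Sequential access: read RUN elements from start, handing each one to the sink. */
    public static void sequentialAccess(IntVector v, int start, IntConsumer sink) {
        for (int i = start; i < start + RUN; i++) {
            sink.accept(v.get(i)); // in a JMH benchmark this would be a blackhole consume
        }
    }

    /** Sum access: same reads as sequential, but accumulated into a long. */
    public static long sumAccess(IntVector v, int start) {
        long sum = 0;
        for (int i = start; i < start + RUN; i++) {
            sum += v.get(i);
        }
        return sum;
    }
}
```

In the sum pattern the loop body has no side effects besides the accumulation, which may be why the JIT can optimize it more aggressively than the sequential pattern, where every element escapes to the sink.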