hi Gonzalo,

This is interesting, thank you. Do you have code available to reproduce
these results?

- Wes

On Fri, Sep 15, 2017 at 9:28 AM, Gonzalo Ortiz Jaureguizar <
golthir...@gmail.com> wrote:

> I forgot to say that the tests were executed on my Ubuntu 17.04 laptop on
> Oracle JDK 1.8.0_144-b01.
>
> 2017-09-15 13:21 GMT+02:00 Gonzalo Ortiz Jaureguizar <golthir...@gmail.com
> >:
>
>> Hi there,
>>
>> I have created a little JMH test to check Arrow's performance. You can
>> find it here. The idea is to test an API with implementations backed by
>> heap arrays, NIO buffers (that follow the Arrow format) and Arrow itself.
>> At the moment the API only supports nullable int buffers and contains
>> read-only methods.
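>>
>> To give an idea of the shape of the API, here is a minimal sketch (the
>> names below are illustrative; they are not necessarily the ones used in
>> the repository):
>>
>>   // IntColumn.java - read-only view over a nullable int column
>>   public interface IntColumn {
>>       int size();
>>       boolean isNull(int index);
>>       int get(int index); // value is undefined when isNull(index) is true
>>   }
>>
>>   // BufIntColumn.java - java.nio implementation that mimics the Arrow
>>   // layout: a validity bitmap plus a packed buffer of 32-bit values
>>   import java.nio.ByteBuffer;
>>
>>   public final class BufIntColumn implements IntColumn {
>>       private final ByteBuffer validity; // 1 bit per element, 1 = non-null
>>       private final ByteBuffer values;   // 4 bytes per element
>>       private final int size;
>>
>>       public BufIntColumn(ByteBuffer validity, ByteBuffer values, int size) {
>>           this.validity = validity;
>>           this.values = values;
>>           this.size = size;
>>       }
>>
>>       @Override public int size() { return size; }
>>
>>       @Override public boolean isNull(int index) {
>>           int b = validity.get(index >>> 3) & 0xFF;
>>           return (b & (1 << (index & 7))) == 0;
>>       }
>>
>>       @Override public int get(int index) {
>>           return values.getInt(index << 2);
>>       }
>>   }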
>>
>> The benchmark runs on automatically generated vectors of 2^10, 2^20 and
>> 2^26 never-null integers, and it tests three different access patterns (a
>> simplified JMH sketch of the three patterns follows the list):
>>
>>    - Random access: a random element is read.
>>    - Sequential access: a random index is chosen and then the following
>>    32 elements are read.
>>    - Sum access: similar to sequential, but instead of simply reading the
>>    elements, they are added into a long.
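>>
>> Very roughly, the three patterns look like this in JMH (a simplified
>> sketch, not the exact benchmark code; the @Setup that fills the column
>> with never-null ints is omitted):
>>
>>   import java.util.concurrent.ThreadLocalRandom;
>>   import org.openjdk.jmh.annotations.*;
>>   import org.openjdk.jmh.infra.Blackhole;
>>
>>   @State(Scope.Benchmark)
>>   public class AccessBenchmark {
>>       @Param({"1024", "1048576", "67108864"}) // 2^10, 2^20, 2^26
>>       int size;
>>
>>       IntColumn column; // filled in @Setup with one of the implementations
>>
>>       @Benchmark
>>       public int randomAccess() {
>>           int i = ThreadLocalRandom.current().nextInt(size);
>>           return column.isNull(i) ? 0 : column.get(i);
>>       }
>>
>>       @Benchmark
>>       public void sequentialAccess(Blackhole bh) {
>>           int base = ThreadLocalRandom.current().nextInt(size - 32);
>>           for (int i = base; i < base + 32; i++) {
>>               bh.consume(column.isNull(i) ? 0 : column.get(i));
>>           }
>>       }
>>
>>       @Benchmark
>>       public long sumAccess() {
>>           int base = ThreadLocalRandom.current().nextInt(size - 32);
>>           long sum = 0;
>>           for (int i = base; i < base + 32; i++) {
>>               if (!column.isNull(i)) {
>>                   sum += column.get(i);
>>               }
>>           }
>>           return sum;
>>       }
>>   }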
>>
>> Disclaimer: microbenchmarks are error prone, I'm not an expert on JMH,
>> and this benchmark was done in a couple of hours.
>>
>> Results
>> On all charts the Y axis is the ratio between the throughput of the
>> off-heap versions and the throughput of the heap version (so the higher
>> the better).
>>
>> TL;DR: it seems that Arrow's more complex structures are preventing some
>> JVM optimizations.
>>
>> Random
>> Random access is quite good. The heap version is a little bit better,
>> but both off-heap solutions perform pretty similarly.
>>
>>         1K      1M      64M
>> Array   75.139  53.025  10.872
>> Arrow   67.399  43.491  10.42
>> Buf     82.877  38.092  10.753
>> [inline image 1: random access chart]
>>
>> Sequential
>> Looking at the absolute values, it is clear that JMH's Blackhole is
>> preventing any JVM optimization of the loop. I think that's fine, as it
>> simulates several calls to the vector in a *non-optimized* scenario.
>> It seems that the JVM is not smart enough to optimize off-heap sequential
>> access as much as it does with heap structures. Although both off-heap
>> implementations are worse than the heap version, the one that uses Arrow
>> is noticeably worse than the one that directly uses ByteBuffers:
>>         1K      1M      64M
>> Array   6.335   4.563   3.145
>> Arrow   2.664   2.453   1.989
>> Buf     4.456   3.971   3.018
>> [inline image 2: sequential access chart]
>>
>> Sum
>> The result is awful. It seems that the JVM is able to optimize (I guess
>> by vectorizing) the heap and ByteBuffer implementations (at least with
>> small vectors), but not the Arrow version. I guess it is due to the
>> indirections and the deeper call stack required to execute the same code
>> on Arrow (see the loop comparison after the chart below).
>>
>>         1K      1M      64M
>> Array   44.833  26.617  9.787
>> Arrow   3.426   3.265   2.521
>> Buf     38.288  19.295  5.668
>> [inline image 4: sum access chart]
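>>
>> For reference, the core difference between the cases the JIT optimizes
>> well and the Arrow case is roughly the following (the Arrow call chain is
>> approximate, based on the accessor-style API of the Arrow Java version I
>> tested):
>>
>>   // Heap version: a tight loop over an int[], which HotSpot can
>>   // unroll and auto-vectorize
>>   for (int i = base; i < base + 32; i++) {
>>       sum += values[i];
>>   }
>>
>>   // Arrow version: every element goes through the vector, its accessor
>>   // and the underlying buffers, each adding calls and checks before the
>>   // actual memory read
>>   for (int i = base; i < base + 32; i++) {
>>       if (!vector.getAccessor().isNull(i)) {
>>           sum += vector.getAccessor().get(i);
>>       }
>>   }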
>>
>>
>
