Re: A simple benchmark on Java implementation

Gonzalo Ortiz Jaureguizar Fri, 15 Sep 2017 06:41:41 -0700

Yeah... I said "you can find it here" but forgot to add the link. My bad.
You can find it here: <https://github.com/gortiz/arrow-jmh>.


2017-09-15 15:33 GMT+02:00 Wes McKinney <wesmck...@gmail.com>:

> hi Gonzalo,
>
> This is interesting, thank you. Do you have code available to reproduce
> these results?
>
> - Wes
>
> On Fri, Sep 15, 2017 at 9:28 AM, Gonzalo Ortiz Jaureguizar <
> golthir...@gmail.com> wrote:
>
> > I forgot to say that the tests were executed on my Ubuntu 17.04 laptop with
> > Oracle JDK 1.8.0_144-b01.
> >
> > 2017-09-15 13:21 GMT+02:00 Gonzalo Ortiz Jaureguizar <golthir...@gmail.com>:
> >
> >> Hi there,
> >>
> >> I have created a little JMH test to check Arrow's performance. You can
> >> find it here. The idea is to test one API with three implementations: heap
> >> arrays, NIO buffers (that follow the Arrow format) and Arrow itself. At the
> >> moment the API only supports nullable int buffers and contains only
> >> read-only methods.
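
A minimal sketch of what such a read-only API could look like, with a heap implementation and an NIO implementation laid out as in the Arrow format (a validity bitmap plus a little-endian 4-byte values buffer). The names NullableIntVector, HeapVector and BufferVector are hypothetical and are not taken from the arrow-jmh repository:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class VectorApiSketch {
    /** Hypothetical read-only API; the names are illustrative, not from arrow-jmh. */
    interface NullableIntVector {
        int size();
        boolean isNull(int index);
        int get(int index);
    }

    /** Heap implementation: plain Java arrays. */
    static final class HeapVector implements NullableIntVector {
        private final int[] values;
        private final boolean[] valid;

        HeapVector(int[] values, boolean[] valid) {
            this.values = values.clone();
            this.valid = valid.clone();
        }

        public int size() { return values.length; }
        public boolean isNull(int index) { return !valid[index]; }
        public int get(int index) { return values[index]; }
    }

    /** Offheap implementation: direct buffers laid out as in the Arrow format. */
    static final class BufferVector implements NullableIntVector {
        private final ByteBuffer validity; // 1 bit per element, least significant bit first, 1 = non-null
        private final ByteBuffer data;     // 4 bytes per element, little-endian
        private final int size;

        BufferVector(int[] values, boolean[] valid) {
            size = values.length;
            validity = ByteBuffer.allocateDirect((size + 7) / 8);
            data = ByteBuffer.allocateDirect(size * 4).order(ByteOrder.LITTLE_ENDIAN);
            for (int i = 0; i < size; i++) {
                data.putInt(i * 4, values[i]);
                if (valid[i]) {
                    int byteIndex = i >> 3;
                    validity.put(byteIndex, (byte) (validity.get(byteIndex) | (1 << (i & 7))));
                }
            }
        }

        public int size() { return size; }
        public boolean isNull(int index) {
            return (validity.get(index >> 3) & (1 << (index & 7))) == 0;
        }
        public int get(int index) { return data.getInt(index * 4); }
    }

    public static void main(String[] args) {
        int[] values = {7, 0, -3};
        boolean[] valid = {true, false, true};
        NullableIntVector heap = new HeapVector(values, valid);
        NullableIntVector buf = new BufferVector(values, valid);
        for (int i = 0; i < heap.size(); i++) {
            System.out.println(i + ": heap=" + (heap.isNull(i) ? "null" : heap.get(i))
                    + " buf=" + (buf.isNull(i) ? "null" : buf.get(i)));
        }
    }
}
```

An Arrow-backed implementation would sit behind the same interface, so the benchmark code can stay identical for all three variants.
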
> >>
> >> The benchmark runs on automatically generated vectors of 2^10, 2^20 and
> >> 2^26 never-null integers and tests three different access patterns:
> >>
> >>    - Random access: a random element is read.
> >>    - Sequential access: a random index is chosen and then the
> >>    following 32 elements are read.
> >>    - Sum access: similar to sequential, but instead of simply being read,
> >>    the elements are added into a long.
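
The three access patterns above can be sketched in plain Java roughly as follows. This is only an illustration over an int[] and does not use JMH; the class and method names are hypothetical, and clamping the start index so that the 32-element run stays inside the vector is an assumption about how the bounds are handled:

```java
import java.util.Random;

final class AccessPatterns {
    static final int RUN_LENGTH = 32;

    /** Picks a start index such that RUN_LENGTH elements fit (assumed bounds handling). */
    static int randomStart(int length, Random rnd) {
        return rnd.nextInt(length - RUN_LENGTH + 1);
    }

    /** Random access: read one element at a random position. */
    static int randomAccess(int[] vector, Random rnd) {
        return vector[rnd.nextInt(vector.length)];
    }

    /**
     * Sequential access: read RUN_LENGTH consecutive elements.
     * In a JMH benchmark each value would typically be handed to a Blackhole;
     * here the last value read is returned so the loop has an observable result.
     */
    static int sequentialAccess(int[] vector, int start) {
        int last = 0;
        for (int i = 0; i < RUN_LENGTH; i++) {
            last = vector[start + i];
        }
        return last;
    }

    /** Sum access: add RUN_LENGTH consecutive elements into a long. */
    static long sumAccess(int[] vector, int start) {
        long sum = 0;
        for (int i = 0; i < RUN_LENGTH; i++) {
            sum += vector[start + i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] vector = new int[1 << 10];
        for (int i = 0; i < vector.length; i++) {
            vector[i] = i;
        }
        Random rnd = new Random(42);
        int start = randomStart(vector.length, rnd);
        System.out.println("random:     " + randomAccess(vector, rnd));
        System.out.println("sequential: " + sequentialAccess(vector, start));
        System.out.println("sum:        " + sumAccess(vector, start));
    }
}
```
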
> >>
> >> Disclaimer: Microbenchmarks are error-prone, I'm not an expert on JMH,
> >> and this benchmark was put together in a couple of hours.
> >>
> >> Results
> >> On all charts the Y axis is the ratio of the throughput of the
> >> offheap versions to that of the heap version (so the higher the better).
> >>
> >> TL;DR: It seems that the complex structures of Arrow are preventing some
> >> optimizations on the JVM.
> >>
> >> Random
> >> Random access performs quite well. The heap version is a little bit better,
> >> but both offheap solutions seem pretty similar.
> >>
> >>         1K      1M      64M
> >> Array   75.139  53.025  10.872
> >> Arrow   67.399  43.491  10.42
> >> Buf     82.877  38.092  10.753
> >> [image: Inline images 1]
> >>
> >> Sequential
> >> Looking at the absolute values, it is clear that JMH's Blackhole is
> >> preventing any JVM optimization of the loop. I think that's fine, as it
> >> simulates several calls to the vector in a *non-optimized* scenario.
> >> It seems that the JVM is not smart enough to optimize offheap sequential
> >> access as much as it does with heap structures. Although both offheap
> >> implementations are worse than the heap version, the one that uses Arrow is
> >> noticeably worse than the one that directly uses ByteBuffers:
> >>         1K      1M      64M
> >> Array   6.335   4.563   3.145
> >> Arrow   2.664   2.453   1.989
> >> Buf     4.456   3.971   3.018
> >> [image: Inline images 2]
> >>
> >> Sum
> >> The result is awful. It seems that the JVM is able to optimize (I guess
> >> by vectorizing) the heap and ByteBuffer implementations (at least with small
> >> vectors), but not the Arrow version. I guess this is due to
> >> the indirections and deeper stack required to execute the same code on
> >> Arrow.
> >>
> >>         1K      1M      64M
> >> Array   44.833  26.617  9.787
> >> Arrow   3.426   3.265   2.521
> >> Buf     38.288  19.295  5.668
> >> [image: Inline images 4]
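
Since the charts did not survive in the archive, the ratios described under Results can be recomputed from the tables. The sketch below does this for the Sum table, assuming the table values are absolute throughputs (the units are not stated in the message); the class and method names are hypothetical:

```java
final class ThroughputRatio {
    /** Offheap throughput relative to heap throughput; higher is better. */
    static double ratio(double offHeap, double heap) {
        return offHeap / heap;
    }

    public static void main(String[] args) {
        String[] sizes = {"1K", "1M", "64M"};
        // Values copied from the Sum table in the message.
        double[] array = {44.833, 26.617, 9.787};
        double[] arrow = {3.426, 3.265, 2.521};
        double[] buf = {38.288, 19.295, 5.668};
        for (int i = 0; i < sizes.length; i++) {
            System.out.printf("%-4s Arrow/Array=%.3f Buf/Array=%.3f%n",
                    sizes[i], ratio(arrow[i], array[i]), ratio(buf[i], array[i]));
        }
    }
}
```
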
> >>
> >>
> >
>
