On Tue, 23 Jan 2024 11:56:58 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:

>> Hi,
>>
>> Patch optimizes non-subword vector compress and expand APIs for x86 AVX2
>> only targets.
>> Upcoming E-core Xeons (Sierra Forest) and Hybrid CPUs only support AVX2
>> instruction set.
>> These are very frequently used APIs in columnar database filter operation.
>>
>> Implementation uses a lookup table to record permute indices. Table index is
>> computed using
>> mask argument of compress/expand operation.
>>
>> Following are the performance number of JMH micro included with the patch.
>>
>>
>> System : Intel(R) Xeon(R) Platinum 8480+ (Sapphire Rapids)
>>
>> Baseline:
>> Benchmark                                 (size)   Mode  Cnt     Score   Error   Units
>> ColumnFilterBenchmark.filterDoubleColumn    1024  thrpt    2   142.767          ops/ms
>> ColumnFilterBenchmark.filterDoubleColumn    2047  thrpt    2    71.436          ops/ms
>> ColumnFilterBenchmark.filterDoubleColumn    4096  thrpt    2    35.992          ops/ms
>> ColumnFilterBenchmark.filterFloatColumn     1024  thrpt    2   182.151          ops/ms
>> ColumnFilterBenchmark.filterFloatColumn     2047  thrpt    2    91.096          ops/ms
>> ColumnFilterBenchmark.filterFloatColumn     4096  thrpt    2    44.757          ops/ms
>> ColumnFilterBenchmark.filterIntColumn       1024  thrpt    2   184.099          ops/ms
>> ColumnFilterBenchmark.filterIntColumn       2047  thrpt    2    91.981          ops/ms
>> ColumnFilterBenchmark.filterIntColumn       4096  thrpt    2    45.170          ops/ms
>> ColumnFilterBenchmark.filterLongColumn      1024  thrpt    2   148.017          ops/ms
>> ColumnFilterBenchmark.filterLongColumn      2047  thrpt    2    73.516          ops/ms
>> ColumnFilterBenchmark.filterLongColumn      4096  thrpt    2    36.844          ops/ms
>>
>> Withopt:
>> Benchmark                                 (size)   Mode  Cnt     Score   Error   Units
>> ColumnFilterBenchmark.filterDoubleColumn    1024  thrpt    2  2051.707          ops/ms
>> ColumnFilterBenchmark.filterDoubleColumn    2047  thrpt    2   914.072          ops/ms
>> ColumnFilterBenchmark.filterDoubleColumn    4096  thrpt    2   489.898          ops/ms
>> ColumnFilterBenchmark.filterFloatColumn     1024  thrpt    2  5324.195          ops/ms
>> ColumnFilterBenchmark.filterFloatColumn     2047  thrpt    2  2587.229          ops/ms
>> ColumnFilterBenchmark.filterFloatColumn     4096  thrpt    2  1278.665          ops/ms
>> ColumnFilterBenchmark.filterIntColumn       1024  thrpt    2  4149.384          ops/ms
>> ColumnFilterBenchmark.filterIntColumn       2047  thrpt ...
>
> Jatin Bhateja has updated the pull request with a new target base due to a
> merge or a rebase. The incremental webrev excludes the unrelated changes
> brought in by the merge/rebase. The pull request contains 10 additional
> commits since the last revision:
>
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8322768
>  - Modifying comments.
>  - Review comments resolution
>  - Modified code comment for clarity.
>  - Space fixup
>  - Using emulated variable blend E-Core optimized instruction.
>  - Review suggestions incorporated.
>  - Review comments resolutions.
>  - Updating copyright year of modified files.
>  - 8322768: Optimize non-subword vector compress and expand APIs for AVX2
>    target.

Testing passed, looks good now :)
Nice progress, the code now is simpler and much more understandable!

-------------

Marked as reviewed by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/17261#pullrequestreview-1843198049
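
For readers following the thread, the lookup-table idea from the quoted PR description ("a lookup table to record permute indices", indexed by the mask) can be modelled in plain scalar Java. This is only a sketch under assumptions, not the HotSpot code from the patch: the class name `CompressLutSketch`, the table layout, the fixed 8-lane width, and the use of `-1` to mark unused slots are all hypothetical choices made for illustration. The real patch works on AVX2 registers and may lay out its table differently.

```java
import java.util.Arrays;

// Scalar model of mask-indexed permute tables (illustrative only; all names are hypothetical).
final class CompressLutSketch {
    static final int LANES = 8; // assumption: eight 32-bit lanes, as in a 256-bit vector

    // PERMUTE_TABLE[mask] lists, packed to the front, the lane numbers whose mask bit is set.
    // Unused trailing slots hold -1.
    static final int[][] PERMUTE_TABLE = new int[1 << LANES][LANES];

    static {
        for (int mask = 0; mask < (1 << LANES); mask++) {
            int[] row = PERMUTE_TABLE[mask];
            Arrays.fill(row, -1);
            int dst = 0;
            for (int lane = 0; lane < LANES; lane++) {
                if ((mask & (1 << lane)) != 0) {
                    row[dst++] = lane;
                }
            }
        }
    }

    // compress: selected lanes move to the front, remaining lanes are zero.
    static int[] compress(int[] src, int mask) {
        int[] row = PERMUTE_TABLE[mask & ((1 << LANES) - 1)];
        int[] out = new int[LANES];
        for (int i = 0; i < LANES; i++) {
            out[i] = row[i] >= 0 ? src[row[i]] : 0;
        }
        return out;
    }

    // expand: leading lanes of src are scattered to the set mask positions, others are zero.
    static int[] expand(int[] src, int mask) {
        int[] row = PERMUTE_TABLE[mask & ((1 << LANES) - 1)];
        int[] out = new int[LANES];
        for (int i = 0; i < LANES && row[i] >= 0; i++) {
            out[row[i]] = src[i];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] src = {10, 20, 30, 40, 50, 60, 70, 80};
        int mask = 0b10100101; // lanes 0, 2, 5 and 7 selected
        System.out.println(Arrays.toString(compress(src, mask))); // [10, 30, 60, 80, 0, 0, 0, 0]
        System.out.println(Arrays.toString(expand(src, mask)));   // [10, 0, 20, 0, 0, 30, 0, 40]
    }
}
```

In this model the per-element loop stands in for a single permute instruction whose index operand is loaded from the table row, which is why the table lookup replaces a data-dependent loop over mask bits.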