> Hi All,
> 
> As per the discussion on the panama-dev mailing list[1], this patch adds 
> support for the following new two-vector permutation API.
> 
> 
> Declaration:-
>     Vector<E>.selectFrom(Vector<E> v1, Vector<E> v2)
> 
> 
> Semantics:-
>     Using the index values stored in the lanes of "this" vector, assemble the 
> values stored in the first (v1) and second (v2) vector arguments. Thus, the 
> first and second vectors serve as a table whose elements are selected by the 
> index vector. The API is applicable to all integral and floating-point types.
> The result of this operation is semantically equivalent to the expression 
> v1.rearrange(this.toShuffle(), v2). Values held in the index vector lanes must 
> lie within the valid two-vector index range [0, 2*VLEN); otherwise an 
> IndexOutOfBoundsException is thrown.
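
To make the semantics above concrete, here is a minimal scalar model in plain Java. It is only a sketch: the class name SelectFromModel and the use of int arrays in place of vectors are assumptions made for illustration, and it does not use the Vector API or reflect how the patch implements the operation.

```java
import java.util.Arrays;

class SelectFromModel {
    // Scalar model of the two-vector selectFrom semantics (illustrative only).
    // v1 followed by v2 is treated as one table of 2*VLEN entries, and
    // result lane i is the table entry addressed by index[i].
    static int[] selectFrom(int[] index, int[] v1, int[] v2) {
        int vlen = index.length;
        if (v1.length != vlen || v2.length != vlen) {
            throw new IllegalArgumentException("all operands must have the same lane count");
        }
        int[] result = new int[vlen];
        for (int i = 0; i < vlen; i++) {
            int idx = index[i];
            // Indices must lie in the two-vector range [0, 2*VLEN).
            if (idx < 0 || idx >= 2 * vlen) {
                throw new IndexOutOfBoundsException(
                    "index " + idx + " outside [0, " + (2 * vlen) + ")");
            }
            // Lower half of the range selects from v1, upper half from v2.
            result[i] = idx < vlen ? v1[idx] : v2[idx - vlen];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] v1 = {10, 11, 12, 13};
        int[] v2 = {20, 21, 22, 23};
        int[] index = {0, 5, 2, 7};
        // prints [10, 21, 12, 23]
        System.out.println(Arrays.toString(selectFrom(index, v1, v2)));
    }
}
```

With VLEN = 4, indices 0 and 2 pick from v1, while indices 5 and 7 pick lanes 1 and 3 of v2.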
> 
> Summary of changes:
> -  Java-side implementation of the new selectFrom API.
> -  C2 compiler IR and inline expander changes.
> -  In the absence of a direct two-vector permutation instruction in the target 
> ISA, a lowering transformation dismantles the new IR into constituent IR 
> supported by the target platform.
> -  Optimized x86 backend implementation for AVX512 and legacy targets.
> -  Functional tests covering the new API.
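
The lowering step in the list above can be pictured with a scalar sketch. The decomposition shown here (wrap the indices, permute each source separately, then blend on a mask) is an assumption about one plausible way to split a two-vector permute into single-vector operations; this message does not spell out the IR sequence the patch actually emits. The class and helper names are hypothetical, and VLEN is assumed to be a power of two with indices already validated to lie in [0, 2*VLEN).

```java
import java.util.Arrays;

class SelectFromLowering {
    // Single-vector permute: result[i] = src[idx[i]], with idx[i] in [0, VLEN).
    static int[] permute(int[] src, int[] idx) {
        int[] out = new int[idx.length];
        for (int i = 0; i < idx.length; i++) {
            out[i] = src[idx[i]];
        }
        return out;
    }

    // Hypothetical decomposition of a two-vector select into
    // two single-vector permutes followed by a masked blend.
    static int[] loweredSelectFrom(int[] index, int[] v1, int[] v2) {
        int vlen = index.length;              // assumed to be a power of two
        int[] wrapped = new int[vlen];
        boolean[] fromSecond = new boolean[vlen];
        for (int i = 0; i < vlen; i++) {
            wrapped[i] = index[i] & (vlen - 1);   // index within a single vector
            fromSecond[i] = index[i] >= vlen;     // mask: lane comes from v2
        }
        int[] p1 = permute(v1, wrapped);
        int[] p2 = permute(v2, wrapped);
        int[] result = new int[vlen];
        for (int i = 0; i < vlen; i++) {
            result[i] = fromSecond[i] ? p2[i] : p1[i];   // blend on the mask
        }
        return result;
    }

    public static void main(String[] args) {
        int[] v1 = {10, 11, 12, 13};
        int[] v2 = {20, 21, 22, 23};
        int[] index = {0, 5, 2, 7};
        // prints [10, 21, 12, 23]
        System.out.println(Arrays.toString(loweredSelectFrom(index, v1, v2)));
    }
}
```

The sketch trades one table lookup over 2*VLEN entries for two lookups over VLEN entries plus a blend, which is why a target with a native two-vector permute would be expected to do better.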
> 
> The JMH microbenchmark included with this patch shows around a 10-15x gain 
> over the existing rearrange API:
> Test System: Intel(R) Xeon(R) Platinum 8480+ [Sapphire Rapids Server]
> 
> 
> Benchmark                                     (size)   Mode  Cnt      Score   Error   Units
> SelectFromBenchmark.rearrangeFromByteVector     1024  thrpt    2   2041.762          ops/ms
> SelectFromBenchmark.rearrangeFromByteVector     2048  thrpt    2   1028.550          ops/ms
> SelectFromBenchmark.rearrangeFromIntVector      1024  thrpt    2    962.605          ops/ms
> SelectFromBenchmark.rearrangeFromIntVector      2048  thrpt    2    479.004          ops/ms
> SelectFromBenchmark.rearrangeFromLongVector     1024  thrpt    2    359.758          ops/ms
> SelectFromBenchmark.rearrangeFromLongVector     2048  thrpt    2    178.192          ops/ms
> SelectFromBenchmark.rearrangeFromShortVector    1024  thrpt    2   1463.459          ops/ms
> SelectFromBenchmark.rearrangeFromShortVector    2048  thrpt    2    727.556          ops/ms
> SelectFromBenchmark.selectFromByteVector        1024  thrpt    2  33254.830          ops/ms
> SelectFromBenchmark.selectFromByteVector        2048  thrpt    2  17313.174          ops/ms
> SelectFromBenchmark.selectFromIntVector         1024  thrpt    2  10756.804          ops/ms
> SelectFromBenchmark.selectFromIntVector         2048  thrpt    2   5398.2...

Jatin Bhateja has updated the pull request incrementally with one additional 
commit since the last revision:

  Adding descriptive comments

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/20508/files
  - new: https://git.openjdk.org/jdk/pull/20508/files/408a8694..8d71f175

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=20508&range=06
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20508&range=05-06

  Stats: 7 lines in 1 file changed: 6 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/20508.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20508/head:pull/20508

PR: https://git.openjdk.org/jdk/pull/20508
