On Fri, 23 Aug 2024 06:09:48 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:

>> Hi All,
>> 
>> As per the discussion on the panama-dev mailing list[1], this patch adds 
>> support for the following new two-vector permutation APIs.
>> 
>> 
>> Declaration:-
>>     Vector<E>.selectFrom(Vector<E> v1, Vector<E> v2)
>> 
>> 
>> Semantics:-
>>     Using index values stored in the lanes of "this" vector, assemble the 
>> values stored in the first (v1) and second (v2) vector arguments. Thus, the 
>> first and second vectors serve as a table whose elements are selected by 
>> the index vector. The API is applicable to all integral and floating-point 
>> types. The result of this operation is semantically equivalent to the 
>> expression v1.rearrange(this.toShuffle(), v2). Values held in the index 
>> vector lanes must lie within the valid two-vector index range [0, 2*VLEN), 
>> else an IndexOutOfBoundsException is thrown.
>> 
>> Summary of changes:
>> -  Java-side implementation of the new selectFrom API.
>> -  C2 compiler IR and inline expander changes.
>> -  In the absence of a direct two-vector permutation instruction in the 
>> target ISA, a lowering transformation dismantles the new IR into 
>> constituent IR supported by the target platform.
>> -  Optimized x86 backend implementation for AVX512 and legacy targets.
>> -  Functional tests covering the new API.
>> 
>> The JMH microbenchmark included with this patch shows around 10-15x gain 
>> over the existing rearrange API:
>> Test System: Intel(R) Xeon(R) Platinum 8480+ [Sapphire Rapids Server]
>> 
>> 
>> Benchmark                                     (size)   Mode  Cnt      Score  Error  Units
>> SelectFromBenchmark.rearrangeFromByteVector     1024  thrpt    2   2041.762         ops/ms
>> SelectFromBenchmark.rearrangeFromByteVector     2048  thrpt    2   1028.550         ops/ms
>> SelectFromBenchmark.rearrangeFromIntVector      1024  thrpt    2    962.605         ops/ms
>> SelectFromBenchmark.rearrangeFromIntVector      2048  thrpt    2    479.004         ops/ms
>> SelectFromBenchmark.rearrangeFromLongVector     1024  thrpt    2    359.758         ops/ms
>> SelectFromBenchmark.rearrangeFromLongVector     2048  thrpt    2    178.192         ops/ms
>> SelectFromBenchmark.rearrangeFromShortVector    1024  thrpt    2   1463.459         ops/ms
>> SelectFromBenchmark.rearrangeFromShortVector    2048  thrpt    2    727.556         ops/ms
>> SelectFromBenchmark.selectFromByteVector        1024  thrpt    2  33254.830         ops/ms
>> SelectFromBenchmark.selectFromByteVector        2048  thrpt    2  17313.174         ops/ms
>> SelectFromBenchmark.selectFromIntVector         1024  thrpt    2  10756.804         ops/ms
>> S...
>
> Jatin Bhateja has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Removing redundant checkIndex routine

The API changes look good. (Note that at the moment we are not proposing to 
change how shuffles work; as you point out, the two-vector `selectFrom` and 
`rearrange` differ in their index representation.)

IIUC, if the more direct two-table instruction is not available, you fall back 
to calling two single-argument rearranges with a blend, as a lowering 
transformation, similar to the fallback Java expression.
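
To check my reading of that fallback, here is a minimal scalar sketch in plain Java (arrays stand in for vectors). The class and method names are hypothetical, and the wrap-by-mask behaviour of the single-table `rearrange` model is an assumption that requires a power-of-two VLEN; it is not taken from the patch.

```java
import java.util.Arrays;

// Scalar model only; names are illustrative and not part of the PR.
final class SelectFromModel {

    // Reference semantics as described in the PR: an index in [0, VLEN)
    // selects from v1, an index in [VLEN, 2*VLEN) selects from v2.
    static int[] selectFromTwo(int[] idx, int[] v1, int[] v2) {
        int vlen = idx.length;
        int[] r = new int[vlen];
        for (int i = 0; i < vlen; i++) {
            int k = idx[i];
            if (k < 0 || k >= 2 * vlen) {
                throw new IndexOutOfBoundsException("index " + k + " at lane " + i);
            }
            r[i] = (k < vlen) ? v1[k] : v2[k - vlen];
        }
        return r;
    }

    // Single-table permute. Assumption: indices wrap into [0, VLEN) with a
    // mask, which is only valid when VLEN is a power of two.
    static int[] rearrange(int[] idx, int[] v) {
        int vlen = v.length;
        int[] r = new int[vlen];
        for (int i = 0; i < vlen; i++) {
            r[i] = v[idx[i] & (vlen - 1)];
        }
        return r;
    }

    // Lowered form: two single-table permutes, then a per-lane blend that
    // picks the v2 result wherever the original index is >= VLEN.
    // Assumes the indices were already validated to lie in [0, 2*VLEN).
    static int[] lowered(int[] idx, int[] v1, int[] v2) {
        int vlen = idx.length;
        int[] fromV1 = rearrange(idx, v1);
        int[] fromV2 = rearrange(idx, v2);
        int[] r = new int[vlen];
        for (int i = 0; i < vlen; i++) {
            r[i] = (idx[i] < vlen) ? fromV1[i] : fromV2[i];
        }
        return r;
    }

    public static void main(String[] args) {
        int[] idx = {0, 5, 2, 7};
        int[] v1 = {10, 11, 12, 13};
        int[] v2 = {20, 21, 22, 23};
        System.out.println(Arrays.toString(selectFromTwo(idx, v1, v2))); // [10, 21, 12, 23]
        System.out.println(Arrays.toString(lowered(idx, v1, v2)));       // [10, 21, 12, 23]
    }
}
```

For in-range indices the two forms agree lane by lane, which is the property the lowering relies on.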

The float/double conversion bothers me. I am not suggesting we do something 
about it here, just noting it down for any future conversation on shuffles. 
Ideally we would want the equivalent integral vector (int or long) to represent 
the index, which is tricky to express in the API, or alternatively we could 
treat it as a bitwise no-op conversion (which would also affect `toShuffle`).
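
As a small illustration of the two interpretations for a single float lane, consider the sketch below. It is plain Java with hypothetical helper names, shown only to contrast a numeric conversion with a bitwise reinterpretation; it does not claim to reproduce what the patch does.

```java
// Illustration only; class and method names are hypothetical.
final class FloatIndexConversion {

    // Numeric conversion: the lane value 3.0f is read as index 3.
    static int numericIndex(float lane) {
        return (int) lane;
    }

    // Bitwise no-op conversion: the lane's raw bits are read as the index.
    static int bitwiseIndex(float lane) {
        return Float.floatToRawIntBits(lane);
    }

    public static void main(String[] args) {
        float lane = 3.0f;
        System.out.println(numericIndex(lane));                        // 3
        System.out.println(Integer.toHexString(bitwiseIndex(lane)));   // 40400000

        // Under the bitwise reading, index 3 has to be stored as the float
        // whose bit pattern is 3 (a tiny subnormal value).
        float packed = Float.intBitsToFloat(3);
        System.out.println(bitwiseIndex(packed));                      // 3
    }
}
```

Under the bitwise reading the index lanes would be unreadable as floats, which is part of why an equivalent integral index vector looks like the cleaner option.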

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20508#issuecomment-2307886044
