On Mon, 6 Mar 2023 23:54:44 GMT, Vladimir Kozlov <k...@openjdk.org> wrote:

>> Implemented `Float.floatToFloat16` and `Float.float16ToFloat` intrinsics in 
>> Interpreter and C1 compiler to produce the same results as C2 intrinsics on 
>> x64, Aarch64 and RISC-V - all platforms where C2 intrinsics for these Java 
>> methods were implemented originally.
>> 
>> Replaced `SharedRuntime::f2hf()` and `hf2f()` C runtime functions with calls 
>> to runtime stubs which use the same HW instructions as C2 intrinsics. Only 
>> for 64-bit x64 because 32-bit x86 stub does not work: result is passed 
>> through FPU register and NaN values become different from C2 intrinsic. This 
>> runtime stub is only used to calculate constant values during C2 compilation 
>> and can be skipped.
>> 
>> I added new tests based on Tobias's `TestAll.java` and copied 
>> `jdk/lang/Float/Binary16Conversion*.java` tests to run them with `-Xcomp` to 
>> make sure code is compiled by C1 or C2. I modified 
>> `Binary16ConversionNaN.java` to compare results from Interpreter, C1 and C2.
>> 
>> Tested tier1-5, Xcomp, stress
>
> @fyang, please help to verify that new tests passed on RISC-V with these 
> changes and review these changes. Thanks!
> 
> I tested x86 (64- and 32-bit) and AArch64.

@vnkozlov Thanks a lot for taking this up. Is the following in the PR 
description still true:
"Replaced SharedRuntime::f2hf() and hf2f() C runtime functions with calls to 
runtime stubs which use the same HW instructions as C2 intrinsics. Only for 
64-bit x64 because 32-bit x86 stub does not work: result is passed through FPU 
register and NaN values become different from C2 intrinsic."
From the PR it looks to me that for x86_64 you have the changes in place for 
SharedRuntime, and the same result is produced across SharedRuntime, the 
interpreter, C1, and C2.
For 32-bit x86, results are also consistent across all of them. The only 
difference is that the SharedRuntime optimization doesn't happen on 32-bit x86, 
because StubRoutines::hf2f() and StubRoutines::f2hf() are set to null. The 
fallback is handled correctly in the interpreter, C1, and C2.
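
To make "consistent" concrete for the NaN case, here is a minimal sketch of a
software binary16 -> binary32 conversion that keeps the NaN payload bits. It is
only an illustration: the class and method names (Fp16Sketch, halfToFloatBits)
are hypothetical, and this is not the HotSpot stub or SharedRuntime code.

```java
public class Fp16Sketch {
    // Hypothetical helper: converts IEEE 754 binary16 bits to binary32 bits
    // using integer arithmetic only, so NaN payload bits are never altered.
    static int halfToFloatBits(short h) {
        int bits = h & 0xFFFF;
        int sign = (bits & 0x8000) << 16;   // move sign to bit 31
        int exp  = (bits >>> 10) & 0x1F;    // 5-bit exponent
        int mant = bits & 0x3FF;            // 10-bit significand

        if (exp == 0x1F) {
            // Infinity or NaN: keep the significand bits, shifted into place.
            return sign | 0x7F800000 | (mant << 13);
        }
        if (exp == 0) {
            if (mant == 0) {
                return sign;                // signed zero
            }
            // Subnormal half: normalize into a normal float.
            int e = 127 - 15 + 1;
            while ((mant & 0x400) == 0) {
                mant <<= 1;
                e--;
            }
            mant &= 0x3FF;                  // drop the now-implicit leading 1
            return sign | (e << 23) | (mant << 13);
        }
        // Normal number: rebias the exponent from 15 to 127.
        return sign | ((exp + 112) << 23) | (mant << 13);
    }

    public static void main(String[] args) {
        short[] inputs = {(short) 0x3C00, (short) 0x7E00, (short) 0x7C01, (short) 0x0001};
        for (short h : inputs) {
            System.out.printf("0x%04X -> 0x%08X%n", h & 0xFFFF, halfToFloatBits(h));
        }
    }
}
```

With this sketch the signaling NaN 0x7C01 maps to 0x7F802000. A path that moves
the value through an x87 FPU register would typically set the quiet bit and
yield 0x7FC02000 instead, which is the kind of mismatch the PR description
mentions for the 32-bit stub.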

-------------

PR: https://git.openjdk.org/jdk/pull/12869
