On Wed, 20 Nov 2024 14:46:46 GMT, Bhavana Kilambi <bkila...@openjdk.org> wrote:

>> Hi All,
>> 
>> This patch adds C2 compiler support for various Float16 operations added by 
>> [PR#22128](https://github.com/openjdk/jdk/pull/22128)
>> 
>> Following is the summary of changes included with this patch:-
>> 
>> 1. Detection of various Float16 operations through inline expansion or 
>> pattern folding idealizations.
>> 2. Float16 operations like add, sub, mul, div, max, and min are inferred 
>> through pattern folding idealization.
>> 3. Float16 SQRT and FMA operations are inferred through inline expansion, and 
>> their corresponding entry points are defined in the newly added Float16Math 
>> class.
>>       -    These intrinsics receive unwrapped short arguments encoding IEEE 
>> 754 binary16 values.
>> 4. New specialized IR nodes for Float16 operations, associated 
>> idealizations, and constant folding routines.
>> 5. New Ideal type for constant and non-constant Float16 IR nodes. Please 
>> refer to the [FAQs](https://github.com/openjdk/jdk/pull/21490#issuecomment-2482867818) 
>> for more details.
>> 6. Since Float16 uses short as its storage type, raw FP16 values are 
>> always loaded into general-purpose registers, but FP16 ISA instructions 
>> generally operate on floating-point registers. The compiler therefore injects 
>> reinterpretation IR before and after Float16 operation nodes to move the short 
>> value to a floating-point register and vice versa.
>> 7. New idealization routines to optimize redundant reinterpretation chains, 
>> e.g. HF2S + S2HF = HF.
>> 8. Auto-vectorization of newly supported scalar operations.
>> 9. X86 and AARCH64 backend implementation for all supported intrinsics.
>> 10. Functional and performance validation tests.
>> 
>> **Missing Pieces:-**
>> **-  AARCH64 Backend.**
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorConvChain.java 
> line 44:
> 
>> 42:     @Test
>> 43:     @IR(applyIfCPUFeatureAnd = {"avx512_fp16", "false", "avx512vl", 
>> "true"}, counts = {IRNode.VECTOR_CAST_HF2F, IRNode.VECTOR_SIZE_ANY, ">= 1", 
>> IRNode.VECTOR_CAST_F2HF, IRNode.VECTOR_SIZE_ANY, " >= 1"})
>> 44:     @IR(applyIfCPUFeatureAnd = {"avx512_fp16", "false", "f16c", "true"}, 
>> counts = {IRNode.VECTOR_CAST_HF2F, IRNode.VECTOR_SIZE_ANY, ">= 1", 
>> IRNode.VECTOR_CAST_F2HF, IRNode.VECTOR_SIZE_ANY, " >= 1"})
> 
> Wouldn't the Ideal transforms convert the IR for this test case to - 
> 
> ReinterpretS2HF     ReinterpretS2HF
>            \           /
>                AddHF
>                  |
>           ReinterpretHF2S
>                  |
>              ConvHF2F
> 
> in which case, ConvF2HF won't match?

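The new transforms are guarded by target feature checks, and the IR test rules 
are enforced only on non-AVX512_FP16 targets, so on the targets where the rules 
apply the conversion nodes are not rewritten and should still match.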
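
To illustrate the guard, here is a minimal sketch of how a rule with `applyIfCPUFeatureAnd` pairs could be evaluated against a set of CPU features. The class `IrRuleGuardSketch`, the method `ruleApplies`, and the feature maps are hypothetical and are not part of the IR test framework; only the feature names and the pair layout are taken from the `@IR` annotations quoted above.

```java
import java.util.Map;

public class IrRuleGuardSketch {
    // Hypothetical helper: returns true only if every (feature, "true"/"false")
    // pair agrees with the CPU feature map, mirroring an AND-style guard.
    static boolean ruleApplies(Map<String, Boolean> cpu, String... pairs) {
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            boolean required = Boolean.parseBoolean(pairs[i + 1]);
            boolean actual = cpu.getOrDefault(pairs[i], false);
            if (actual != required) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumed feature sets, for illustration only.
        Map<String, Boolean> f16cOnly =
            Map.of("avx512_fp16", false, "avx512vl", false, "f16c", true);
        Map<String, Boolean> fp16Target =
            Map.of("avx512_fp16", true, "avx512vl", true, "f16c", true);

        // Rule from line 44 of the test: enforced on the F16C-only machine.
        System.out.println(ruleApplies(f16cOnly, "avx512_fp16", "false", "f16c", "true"));
        // Same rule on an AVX512_FP16 machine: skipped, so the folded
        // ReinterpretS2HF/AddHF shape is never checked against it.
        System.out.println(ruleApplies(fp16Target, "avx512_fp16", "false", "f16c", "true"));
    }
}
```

Under these assumptions, the rule is checked only where the Float16 transforms are disabled, which is why the `VECTOR_CAST_HF2F` and `VECTOR_CAST_F2HF` counts are not expected to conflict with the new idealizations.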

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1850469049
