On Wed, 20 Nov 2024 14:46:46 GMT, Bhavana Kilambi <bkila...@openjdk.org> wrote:
>> Hi All,
>>
>> This patch adds C2 compiler support for various Float16 operations added by
>> [PR#22128](https://github.com/openjdk/jdk/pull/22128)
>>
>> Following is the summary of changes included with this patch:-
>>
>> 1. Detection of various Float16 operations through inline expansion or
>> pattern folding idealizations.
>> 2. Float16 operations like add, sub, mul, div, max, and min are inferred
>> through pattern folding idealization.
>> 3. Float16 SQRT and FMA operations are inferred through inline expansion and
>> their corresponding entry points are defined in the newly added Float16Math
>> class.
>>    - These intrinsics receive unwrapped short arguments encoding IEEE
>> 754 binary16 values.
>> 5. New specialized IR nodes for Float16 operations, associated
>> idealizations, and constant folding routines.
>> 6. New Ideal type for constant and non-constant Float16 IR nodes. Please
>> refer to
>> [FAQs](https://github.com/openjdk/jdk/pull/21490#issuecomment-2482867818)
>> for more details.
>> 7. Since Float16 uses short as its storage type, raw FP16 values are
>> always loaded into general purpose registers, but FP16 ISA instructions
>> generally operate over floating point registers; the compiler therefore
>> injects reinterpretation IR before and after Float16 operation nodes to move
>> the short value to a floating point register and vice versa.
>> 8. New idealization routines to optimize redundant reinterpretation chains.
>> HF2S + S2HF = HF
>> 6. Auto-vectorization of newly supported scalar operations.
>> 7. X86 and AARCH64 backend implementation for all supported intrinsics.
>> 9. Functional and Performance validation tests.
>>
>> **Missing Pieces:-**
>> **- AARCH64 Backend.**
>>
>> Kindly review and share your feedback.
>>
>> Best Regards,
>> Jatin
>
> test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorConvChain.java
> line 44:
>
>> 42:     @Test
>> 43:     @IR(applyIfCPUFeatureAnd = {"avx512_fp16", "false", "avx512vl",
>> "true"}, counts = {IRNode.VECTOR_CAST_HF2F, IRNode.VECTOR_SIZE_ANY, ">= 1",
>> IRNode.VECTOR_CAST_F2HF, IRNode.VECTOR_SIZE_ANY, " >= 1"})
>> 44:     @IR(applyIfCPUFeatureAnd = {"avx512_fp16", "false", "f16c", "true"},
>> counts = {IRNode.VECTOR_CAST_HF2F, IRNode.VECTOR_SIZE_ANY, ">= 1",
>> IRNode.VECTOR_CAST_F2HF, IRNode.VECTOR_SIZE_ANY, " >= 1"})
>
> Wouldn't the Ideal transforms convert the IR for this test case to -
>
> ReinterpretS2HF   ReinterpretS2HF
>            \         /
>               AddHF
>                 |
>          ReinterpretHF2S
>                 |
>             ConvHF2F
>
> in which case, ConvF2HF won't match?

New transforms are guarded by target feature checks; the IR test rules are enforced only on non-AVX512_FP16 targets.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21490#discussion_r1850469049
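
For readers following the thread, here is a minimal Java sketch of the scalar shape being discussed: two binary16 values stored in `short`, widened to `float`, added, and narrowed back. It is not code from the PR or from the test. The class name `Float16Sketch` and the helper `addHalf` are hypothetical, and the sketch assumes JDK 20 or later for `Float.float16ToFloat` and `Float.floatToFloat16`. Going by the exchange above, on AVX512_FP16 targets the new transforms may fold this pattern into an `AddHF` between reinterpret nodes, while other targets keep the conversion nodes that the IR rules count; the exact IR produced is an assumption here, not something this snippet verifies.

```java
public class Float16Sketch {
    // Hypothetical helper (not part of the PR): adds two IEEE 754 binary16
    // values that are stored in shorts, going through float.
    static short addHalf(short a, short b) {
        float fa = Float.float16ToFloat(a); // widen binary16 -> binary32
        float fb = Float.float16ToFloat(b);
        return Float.floatToFloat16(fa + fb); // add in float, narrow back
    }

    public static void main(String[] args) {
        short one = Float.floatToFloat16(1.0f); // bit pattern 0x3C00
        short two = Float.floatToFloat16(2.0f); // bit pattern 0x4000
        short sum = addHalf(one, two);
        System.out.println(Float.float16ToFloat(sum)); // prints 3.0
    }
}
```

The helper keeps `short` as the storage type at its boundary, which mirrors why the patch description talks about moving values between general purpose and floating point registers around each Float16 operation.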