On Thu, Jul 1, 2021 at 1:48 PM liuhongt <hongtao....@intel.com> wrote: > > From: "Guo, Xuepeng" <xuepeng....@intel.com> > > gcc/ChangeLog: > > * common/config/i386/cpuinfo.h (get_available_features): > Detect FEATURE_AVX512FP16. > * common/config/i386/i386-common.c > (OPTION_MASK_ISA_AVX512FP16_SET, > OPTION_MASK_ISA_AVX512FP16_UNSET, > OPTION_MASK_ISA2_AVX512FP16_SET, > OPTION_MASK_ISA2_AVX512FP16_UNSET): New. > (OPTION_MASK_ISA2_AVX512BW_UNSET, > OPTION_MASK_ISA2_AVX512BF16_UNSET): Add AVX512FP16. > (ix86_handle_option): Handle -mavx512fp16. > * common/config/i386/i386-cpuinfo.h (enum processor_features): > Add FEATURE_AVX512FP16. > * common/config/i386/i386-isas.h: Add entry for AVX512FP16. > * config.gcc: Add avx512fp16intrin.h. > * config/i386/avx512fp16intrin.h: New intrinsic header. > * config/i386/cpuid.h: Add bit_AVX512FP16. > * config/i386/i386-builtin-types.def: (FLOAT16): New primitive type. > (UINT8): Ditto. > (V8HF): New vector type. > * config/i386/i386-builtins.c: Support _Float16 type for i386 backend. > * config/i386/i386-c.c (ix86_target_macros_internal): Define > __AVX512FP16__. > (ix86_target_macros): Undefine all _Float16 macros when AVX512FP16 is > disabled. > * config/i386/i386-expand.c (ix86_expand_move): Issue error when > using HFmode without AVX512FP16 enabled. > (ix86_expand_branch): Support HFmode. > * config/i386/i386-isa.def: Add PTA define for AVX512FP16. > * config/i386/i386-modes.def: Add HFmode. > * config/i386/i386-options.c (isa2_opts): Add -mavx512fp16. > (ix86_valid_target_attribute_inner_p): Add avx512fp16 attribute. > (ix86_option_override_internal): Enable SSE math for AVX512FP16. > * config/i386/i386.c (classify_argument): Add HFmode and > HCmode. > (construct_container): Avoid HCmode. > (function_value_32): Set return register to xmm0 for HF/HCmode. > (function_value_64): Add HFmode and HCmode. > (ix86_get_ssemov): Use vmovdqu16/vmovw/vmovsh for HFmode/HImode > scalar or vector. 
> (ix86_print_operand): Update output for HFmode constant. > (output_387_binary_op): Update instruction suffix for HFmode. > (sse_store_index): Use SFmode cost for HFmode cost. > (inline_memory_move_cost): Add HFmode, and prefer SSE cost over > GPR cost for HFmode. > (ix86_hard_regno_mode_ok): Allow HFmode. > (ix86_set_reg_reg_cost): Support cost for FP16 modes. > (ix86_scalar_mode_supported_p): Add HFmode. > (ix86_libgcc_floating_mode_supported_p): New function for > TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P hook. > (ix86_mangle_type): Add mangling for _Float16 type. > (ix86_get_excess_precision): Set FLT_EVAL_METHOD for AVX512FP16. > (ix86_can_inline_p): Skip fpmath check when AVX512FP16 enabled. > (TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P): Define. > * config/i386/i386.h (VALID_AVX512FP16_REG_MODE): New. > (VALID_SSE_REG_MODE): Add HFmode. > (VALID_FP_MODE_P): Add HFmode and HCmode. > (SSE_FLOAT_MODE_P): Add HFmode. > (PTA_SAPPHIRERAPIDS): Add PTA_AVX512FP16. > * config/i386/i386.md (mode): Add HFmode. > (MODE_SIZE): Add HFmode. > (MODESH): New mode iterator. > (MODEFH): Likewise. > (X87MODEFH): Likewise. > (ssemodesuffix): Add sh suffix for HFmode. > (cbranch<mode>4): Use MODEFH. > (<insn><mode>3): Likewise. > (mul<mode>3): Likewise. > (div<mode>3): Likewise. > (*ieee_s<ieee_maxmin><mode>3): Likewise. > (*cmpi<unord>hf): New define_insn for HFmode. > (*pushhf_rex64): Likewise. > (*pushhf): Likewise. > (*movhf_internal): Likewise. > (extendhf<mode>2): Likewise. > (trunc<mode>hf2): Likewise. > (*fop_hf_comm): Likewise. > (*fop_hf_1): Likewise. > (float<floatunssuffix><mode>hf2): Likewise. > (define_split): Use MODESH. > (mov<mode>): Use X87MODEFH. > (mov<mode>cc): Likewise. > * config/i386/i386.opt: Add mavx512fp16. > * config/i386/immintrin.h: Include avx512fp16intrin.h. > * config/i386/sse.md (VFH_128): New mode iterator. > (sse): Add scalar and vector HFmodes. > (ssescalarmode): Add vector HFmode mapping. > (ssescalarmodesuffix): Add sh suffix for HFmode. 
> (*<sse>_vm<insn><mode>3): Use VFH_128. > (*<sse>_vm<multdiv_mnemonic><mode>3): Likewise. > (*ieee_<ieee_maxmin><mode>3): Likewise. > * doc/invoke.texi: Add mavx512fp16. > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/avx-1.c: Add -mavx512fp16 in dg-options. > * gcc.target/i386/avx-2.c: Ditto. > * gcc.target/i386/avx512-check.h: Check cpuid for AVX512FP16. > * gcc.target/i386/funcspec-56.inc: Add new target attribute check. > * gcc.target/i386/sse-13.c: Add -mavx512fp16. > * gcc.target/i386/sse-14.c: Ditto. > * gcc.target/i386/sse-22.c: Ditto. > * gcc.target/i386/sse-23.c: Ditto. > * lib/target-supports.exp: (check_effective_target_avx512fp16): New. > * g++.target/i386/float16-1.C: New test. > * g++.target/i386/float16-2.C: Ditto. > * g++.target/i386/float16-3.C: Ditto. > * gcc.target/i386/avx512fp16-12a.c: Ditto. > * gcc.target/i386/avx512fp16-12b.c: Ditto. > * gcc.target/i386/float16-1.c: Ditto. > * gcc.target/i386/float16-2.c: Ditto. > * gcc.target/i386/float16-3a.c: Ditto. > * gcc.target/i386/float16-3b.c: Ditto. > * gcc.target/i386/float16-4a.c: Ditto. > * gcc.target/i386/float16-4b.c: Ditto. > * gcc.target/i386/pr54855-12.c: Ditto. > > Co-Authored-By: Guo, Xuepeng <xuepeng....@intel.com> > Co-Authored-By: H.J. 
Lu <hongjiu...@intel.com> > Co-Authored-By: Liu, Hongtao <hongtao....@intel.com> > Co-Authored-By: Wang, Hongyu <hongyu.w...@intel.com> > Co-Authored-By: Xu, Dianhong <dianhong...@intel.com> > --- > gcc/common/config/i386/cpuinfo.h | 2 + > gcc/common/config/i386/i386-common.c | 26 +- > gcc/common/config/i386/i386-cpuinfo.h | 1 + > gcc/common/config/i386/i386-isas.h | 1 + > gcc/config.gcc | 2 +- > gcc/config/i386/avx512fp16intrin.h | 53 ++++ > gcc/config/i386/cpuid.h | 1 + > gcc/config/i386/i386-builtin-types.def | 7 +- > gcc/config/i386/i386-builtins.c | 6 + > gcc/config/i386/i386-c.c | 20 ++ > gcc/config/i386/i386-expand.c | 8 + > gcc/config/i386/i386-isa.def | 1 + > gcc/config/i386/i386-modes.def | 1 + > gcc/config/i386/i386-options.c | 10 +- > gcc/config/i386/i386.c | 158 ++++++++++-- > gcc/config/i386/i386.h | 18 +- > gcc/config/i386/i386.md | 242 +++++++++++++++--- > gcc/config/i386/i386.opt | 4 + > gcc/config/i386/immintrin.h | 2 + > gcc/config/i386/sse.md | 42 +-- > gcc/doc/invoke.texi | 10 +- > gcc/testsuite/g++.target/i386/float16-1.C | 8 + > gcc/testsuite/g++.target/i386/float16-2.C | 14 + > gcc/testsuite/g++.target/i386/float16-3.C | 10 + > gcc/testsuite/gcc.target/i386/avx-1.c | 2 +- > gcc/testsuite/gcc.target/i386/avx-2.c | 2 +- > gcc/testsuite/gcc.target/i386/avx512-check.h | 3 + > .../gcc.target/i386/avx512fp16-12a.c | 21 ++ > .../gcc.target/i386/avx512fp16-12b.c | 27 ++ > gcc/testsuite/gcc.target/i386/float16-1.c | 8 + > gcc/testsuite/gcc.target/i386/float16-2.c | 14 + > gcc/testsuite/gcc.target/i386/float16-3a.c | 10 + > gcc/testsuite/gcc.target/i386/float16-3b.c | 10 + > gcc/testsuite/gcc.target/i386/float16-4a.c | 10 + > gcc/testsuite/gcc.target/i386/float16-4b.c | 10 + > gcc/testsuite/gcc.target/i386/funcspec-56.inc | 2 + > gcc/testsuite/gcc.target/i386/pr54855-12.c | 14 + > gcc/testsuite/gcc.target/i386/sse-13.c | 2 +- > gcc/testsuite/gcc.target/i386/sse-14.c | 2 +- > gcc/testsuite/gcc.target/i386/sse-22.c | 4 +- > 
gcc/testsuite/gcc.target/i386/sse-23.c | 2 +- > gcc/testsuite/lib/target-supports.exp | 13 +- > 42 files changed, 704 insertions(+), 99 deletions(-) > create mode 100644 gcc/config/i386/avx512fp16intrin.h > create mode 100644 gcc/testsuite/g++.target/i386/float16-1.C > create mode 100644 gcc/testsuite/g++.target/i386/float16-2.C > create mode 100644 gcc/testsuite/g++.target/i386/float16-3.C > create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-12a.c > create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-12b.c > create mode 100644 gcc/testsuite/gcc.target/i386/float16-1.c > create mode 100644 gcc/testsuite/gcc.target/i386/float16-2.c > create mode 100644 gcc/testsuite/gcc.target/i386/float16-3a.c > create mode 100644 gcc/testsuite/gcc.target/i386/float16-3b.c > create mode 100644 gcc/testsuite/gcc.target/i386/float16-4a.c > create mode 100644 gcc/testsuite/gcc.target/i386/float16-4b.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr54855-12.c > > diff --git a/gcc/common/config/i386/cpuinfo.h > b/gcc/common/config/i386/cpuinfo.h > index 458f41de776..1835ac64e67 100644 > --- a/gcc/common/config/i386/cpuinfo.h > +++ b/gcc/common/config/i386/cpuinfo.h > @@ -731,6 +731,8 @@ get_available_features (struct __processor_model > *cpu_model, > set_feature (FEATURE_AVX5124FMAPS); > if (edx & bit_AVX512VP2INTERSECT) > set_feature (FEATURE_AVX512VP2INTERSECT); > + if (edx & bit_AVX512FP16) > + set_feature (FEATURE_AVX512FP16); > } > > __cpuid_count (7, 1, eax, ebx, ecx, edx); > diff --git a/gcc/common/config/i386/i386-common.c > b/gcc/common/config/i386/i386-common.c > index e156cc34584..197e9cd86b4 100644 > --- a/gcc/common/config/i386/i386-common.c > +++ b/gcc/common/config/i386/i386-common.c > @@ -82,6 +82,8 @@ along with GCC; see the file COPYING3. 
If not see > #define OPTION_MASK_ISA2_AVX5124VNNIW_SET OPTION_MASK_ISA2_AVX5124VNNIW > #define OPTION_MASK_ISA_AVX512VBMI2_SET \ > (OPTION_MASK_ISA_AVX512VBMI2 | OPTION_MASK_ISA_AVX512F_SET) > +#define OPTION_MASK_ISA_AVX512FP16_SET OPTION_MASK_ISA_AVX512BW_SET > +#define OPTION_MASK_ISA2_AVX512FP16_SET OPTION_MASK_ISA2_AVX512FP16 > #define OPTION_MASK_ISA_AVX512VNNI_SET \ > (OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512F_SET) > #define OPTION_MASK_ISA2_AVXVNNI_SET OPTION_MASK_ISA2_AVXVNNI > @@ -231,6 +233,8 @@ along with GCC; see the file COPYING3. If not see > #define OPTION_MASK_ISA2_AVX5124FMAPS_UNSET OPTION_MASK_ISA2_AVX5124FMAPS > #define OPTION_MASK_ISA2_AVX5124VNNIW_UNSET OPTION_MASK_ISA2_AVX5124VNNIW > #define OPTION_MASK_ISA_AVX512VBMI2_UNSET OPTION_MASK_ISA_AVX512VBMI2 > +#define OPTION_MASK_ISA_AVX512FP16_UNSET OPTION_MASK_ISA_AVX512BW_UNSET > +#define OPTION_MASK_ISA2_AVX512FP16_UNSET OPTION_MASK_ISA2_AVX512FP16 > #define OPTION_MASK_ISA_AVX512VNNI_UNSET OPTION_MASK_ISA_AVX512VNNI > #define OPTION_MASK_ISA2_AVXVNNI_UNSET OPTION_MASK_ISA2_AVXVNNI > #define OPTION_MASK_ISA_AVX512VPOPCNTDQ_UNSET OPTION_MASK_ISA_AVX512VPOPCNTDQ > @@ -313,7 +317,8 @@ along with GCC; see the file COPYING3. If not see > (OPTION_MASK_ISA2_AVX512BF16_UNSET \ > | OPTION_MASK_ISA2_AVX5124FMAPS_UNSET \ > | OPTION_MASK_ISA2_AVX5124VNNIW_UNSET \ > - | OPTION_MASK_ISA2_AVX512VP2INTERSECT_UNSET) > + | OPTION_MASK_ISA2_AVX512VP2INTERSECT_UNSET \ > + | OPTION_MASK_ISA2_AVX512FP16_UNSET) > #define OPTION_MASK_ISA2_GENERAL_REGS_ONLY_UNSET \ > (OPTION_MASK_ISA2_AVX512F_UNSET) > #define OPTION_MASK_ISA2_AVX_UNSET OPTION_MASK_ISA2_AVX2_UNSET > @@ -326,7 +331,9 @@ along with GCC; see the file COPYING3. 
If not see > (OPTION_MASK_ISA2_SSE3_UNSET | OPTION_MASK_ISA2_KL_UNSET) > #define OPTION_MASK_ISA2_SSE_UNSET OPTION_MASK_ISA2_SSE2_UNSET > > -#define OPTION_MASK_ISA2_AVX512BW_UNSET OPTION_MASK_ISA2_AVX512BF16_UNSET > +#define OPTION_MASK_ISA2_AVX512BW_UNSET \ > + (OPTION_MASK_ISA2_AVX512BF16_UNSET \ > + | OPTION_MASK_ISA2_AVX512FP16_UNSET) > > /* Set 1 << value as value of -malign-FLAG option. */ > > @@ -830,6 +837,21 @@ ix86_handle_option (struct gcc_options *opts, > } > return true; > > + case OPT_mavx512fp16: > + if (value) > + { > + opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_AVX512FP16_SET; > + opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA2_AVX512FP16_SET; > + opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512FP16_SET; > + opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512FP16_SET; > + } > + else > + { > + opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA2_AVX512FP16_UNSET; > + opts->x_ix86_isa_flags2_explicit |= > OPTION_MASK_ISA2_AVX512FP16_UNSET; > + } > + return true; > + > case OPT_mavx512vnni: > if (value) > { > diff --git a/gcc/common/config/i386/i386-cpuinfo.h > b/gcc/common/config/i386/i386-cpuinfo.h > index e68dd656046..4e0659fc7b2 100644 > --- a/gcc/common/config/i386/i386-cpuinfo.h > +++ b/gcc/common/config/i386/i386-cpuinfo.h > @@ -228,6 +228,7 @@ enum processor_features > FEATURE_AESKLE, > FEATURE_WIDEKL, > FEATURE_AVXVNNI, > + FEATURE_AVX512FP16, > CPU_FEATURE_MAX > }; > > diff --git a/gcc/common/config/i386/i386-isas.h > b/gcc/common/config/i386/i386-isas.h > index 898c18f3dda..a6783660278 100644 > --- a/gcc/common/config/i386/i386-isas.h > +++ b/gcc/common/config/i386/i386-isas.h > @@ -169,4 +169,5 @@ ISA_NAMES_TABLE_START > ISA_NAMES_TABLE_ENTRY("aeskle", FEATURE_AESKLE, P_NONE, NULL) > ISA_NAMES_TABLE_ENTRY("widekl", FEATURE_WIDEKL, P_NONE, "-mwidekl") > ISA_NAMES_TABLE_ENTRY("avxvnni", FEATURE_AVXVNNI, P_NONE, "-mavxvnni") > + ISA_NAMES_TABLE_ENTRY("avx512fp16", FEATURE_AVX512FP16, P_NONE, > "-mavx512fp16") > ISA_NAMES_TABLE_END > diff 
--git a/gcc/config.gcc b/gcc/config.gcc > index 0230bb88861..5b4f894185a 100644 > --- a/gcc/config.gcc > +++ b/gcc/config.gcc > @@ -416,7 +416,7 @@ i[34567]86-*-* | x86_64-*-*) > tsxldtrkintrin.h amxtileintrin.h amxint8intrin.h > amxbf16intrin.h x86gprintrin.h uintrintrin.h > hresetintrin.h keylockerintrin.h avxvnniintrin.h > - mwaitintrin.h" > + mwaitintrin.h avx512fp16intrin.h" > ;; > ia64-*-*) > extra_headers=ia64intrin.h > diff --git a/gcc/config/i386/avx512fp16intrin.h > b/gcc/config/i386/avx512fp16intrin.h > new file mode 100644 > index 00000000000..38d63161ba6 > --- /dev/null > +++ b/gcc/config/i386/avx512fp16intrin.h > @@ -0,0 +1,53 @@ > +/* Copyright (C) 2019 Free Software Foundation, Inc. > + > + This file is part of GCC. > + > + GCC is free software; you can redistribute it and/or modify > + it under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3, or (at your option) > + any later version. > + > + GCC is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + GNU General Public License for more details. > + > + Under Section 7 of GPL version 3, you are granted additional > + permissions described in the GCC Runtime Library Exception, version > + 3.1, as published by the Free Software Foundation. > + > + You should have received a copy of the GNU General Public License and > + a copy of the GCC Runtime Library Exception along with this program; > + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > + <http://www.gnu.org/licenses/>. */ > + > +#ifndef _IMMINTRIN_H_INCLUDED > +#error "Never use <avx512fp16intrin.h> directly; include <immintrin.h> > instead." 
> +#endif > + > +#ifndef __AVX512FP16INTRIN_H_INCLUDED > +#define __AVX512FP16INTRIN_H_INCLUDED > + > +#ifndef __AVX512FP16__ > +#pragma GCC push_options > +#pragma GCC target("avx512fp16") > +#define __DISABLE_AVX512FP16__ > +#endif /* __AVX512FP16__ */ > + > +/* Internal data types for implementing the intrinsics. */ > +typedef _Float16 __v8hf __attribute__ ((__vector_size__ (16))); > +typedef _Float16 __v16hf __attribute__ ((__vector_size__ (32))); > +typedef _Float16 __v32hf __attribute__ ((__vector_size__ (64))); > + > +/* The Intel API is flexible enough that we must allow aliasing with other > + vector types, and their scalar components. */ > +typedef _Float16 __m128h __attribute__ ((__vector_size__ (16), > __may_alias__)); > +typedef _Float16 __m256h __attribute__ ((__vector_size__ (32), > __may_alias__)); > +typedef _Float16 __m512h __attribute__ ((__vector_size__ (64), > __may_alias__)); > + > +#ifdef __DISABLE_AVX512FP16__ > +#undef __DISABLE_AVX512FP16__ > +#pragma GCC pop_options > +#endif /* __DISABLE_AVX512FP16__ */ > + > +#endif /* __AVX512FP16INTRIN_H_INCLUDED */ > diff --git a/gcc/config/i386/cpuid.h b/gcc/config/i386/cpuid.h > index aebc17c6827..82b8050028b 100644 > --- a/gcc/config/i386/cpuid.h > +++ b/gcc/config/i386/cpuid.h > @@ -126,6 +126,7 @@ > #define bit_AVX5124VNNIW (1 << 2) > #define bit_AVX5124FMAPS (1 << 3) > #define bit_AVX512VP2INTERSECT (1 << 8) > +#define bit_AVX512FP16 (1 << 23) > #define bit_IBT (1 << 20) > #define bit_UINTR (1 << 5) > #define bit_PCONFIG (1 << 18) > diff --git a/gcc/config/i386/i386-builtin-types.def > b/gcc/config/i386/i386-builtin-types.def > index 3ca313c19ec..eb5153002ae 100644 > --- a/gcc/config/i386/i386-builtin-types.def > +++ b/gcc/config/i386/i386-builtin-types.def > @@ -68,6 +68,7 @@ DEF_PRIMITIVE_TYPE (UINT8, unsigned_char_type_node) > DEF_PRIMITIVE_TYPE (UINT16, short_unsigned_type_node) > DEF_PRIMITIVE_TYPE (INT64, long_long_integer_type_node) > DEF_PRIMITIVE_TYPE (UINT64, 
long_long_unsigned_type_node) > +DEF_PRIMITIVE_TYPE (FLOAT16, float16_type_node) > DEF_PRIMITIVE_TYPE (FLOAT, float_type_node) > DEF_PRIMITIVE_TYPE (DOUBLE, double_type_node) > DEF_PRIMITIVE_TYPE (FLOAT80, float80_type_node) > @@ -84,6 +85,7 @@ DEF_VECTOR_TYPE (V8QI, QI) > # SSE vectors > DEF_VECTOR_TYPE (V2DF, DOUBLE) > DEF_VECTOR_TYPE (V4SF, FLOAT) > +DEF_VECTOR_TYPE (V8HF, FLOAT16) > DEF_VECTOR_TYPE (V2DI, DI) > DEF_VECTOR_TYPE (V4SI, SI) > DEF_VECTOR_TYPE (V8HI, HI) > @@ -1296,4 +1298,7 @@ DEF_FUNCTION_TYPE (UINT, UINT, V2DI, V2DI, PVOID) > DEF_FUNCTION_TYPE (UINT, UINT, V2DI, PVOID) > DEF_FUNCTION_TYPE (VOID, V2DI, V2DI, V2DI, UINT) > DEF_FUNCTION_TYPE (UINT8, PV2DI, V2DI, PCVOID) > -DEF_FUNCTION_TYPE (UINT8, PV2DI, PCV2DI, PCVOID) > \ No newline at end of file > +DEF_FUNCTION_TYPE (UINT8, PV2DI, PCV2DI, PCVOID) > + > +# FP16 builtins > +DEF_FUNCTION_TYPE (V8HF, V8HI) > diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c > index 204e2903126..826fa650f21 100644 > --- a/gcc/config/i386/i386-builtins.c > +++ b/gcc/config/i386/i386-builtins.c > @@ -1371,6 +1371,12 @@ ix86_init_builtin_types (void) > it. */ > lang_hooks.types.register_builtin_type (float128_type_node, "__float128"); > > + /* Provide the _Float16 type if needed so that it can be used in > + AVX512FP16 intrinsics. 
*/ > + if (!maybe_get_identifier ("_Float16")) > + lang_hooks.types.register_builtin_type (float16_type_node, > + "_Float16"); > + > const_string_type_node > = build_pointer_type (build_qualified_type > (char_type_node, TYPE_QUAL_CONST)); > diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c > index 5ed0de006fb..d3704717b2a 100644 > --- a/gcc/config/i386/i386-c.c > +++ b/gcc/config/i386/i386-c.c > @@ -598,6 +598,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag, > def_or_undef (parse_in, "__PTWRITE__"); > if (isa_flag2 & OPTION_MASK_ISA2_AVX512BF16) > def_or_undef (parse_in, "__AVX512BF16__"); > + if (isa_flag2 & OPTION_MASK_ISA2_AVX512FP16) > + def_or_undef (parse_in, "__AVX512FP16__"); > if (TARGET_MMX_WITH_SSE) > def_or_undef (parse_in, "__MMX_WITH_SSE__"); > if (isa_flag2 & OPTION_MASK_ISA2_ENQCMD) > @@ -771,6 +773,24 @@ ix86_target_macros (void) > > cpp_define (parse_in, "__SIZEOF_FLOAT128__=16"); > > + if (!TARGET_AVX512FP16) > + { > + /* NB: _Float16 is always provided in the C front-end for > + AVX512FP16 intrinsics. If AVX512FP16 isn't enabled, undef > + all _Float16 macros. 
*/ > + cpp_undef (parse_in, "__FLT16_MANT_DIG__"); > + cpp_undef (parse_in, "__FLT16_DIG__"); > + cpp_undef (parse_in, "__FLT16_MIN_EXP__"); > + cpp_undef (parse_in, "__FLT16_MIN_10_EXP__"); > + cpp_undef (parse_in, "__FLT16_MAX_EXP__"); > + cpp_undef (parse_in, "__FLT16_MAX_10_EXP__"); > + cpp_undef (parse_in, "__FLT16_MAX__"); > + cpp_undef (parse_in, "__FLT16_EPSILON__"); > + cpp_undef (parse_in, "__FLT16_MIN__"); > + cpp_undef (parse_in, "__FLT16_DECIMAL_DIG__"); > + cpp_undef (parse_in, "__FLT16_DENORM_MIN__"); > + } > + > cpp_define_formatted (parse_in, "__ATOMIC_HLE_ACQUIRE=%d", > IX86_HLE_ACQUIRE); > cpp_define_formatted (parse_in, "__ATOMIC_HLE_RELEASE=%d", > IX86_HLE_RELEASE); > > diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c > index e9763eb5b3e..ab5f5b284c8 100644 > --- a/gcc/config/i386/i386-expand.c > +++ b/gcc/config/i386/i386-expand.c > @@ -197,6 +197,13 @@ ix86_expand_move (machine_mode mode, rtx operands[]) > rtx tmp, addend = NULL_RTX; > enum tls_model model; > > + /* NB: HFmode is always enabled so that the _Float16 type can be > + used for AVX512FP16 intrinsics. We will issue an error here > + if AVX512FP16 isn't available. 
*/ > + if (mode == HFmode && !TARGET_AVX512FP16) > + fatal_error (input_location, > + "%<_Float16%> is not supported on this target"); > + > op0 = operands[0]; > op1 = operands[1]; > > @@ -2132,6 +2139,7 @@ ix86_expand_branch (enum rtx_code code, rtx op0, rtx > op1, rtx label) > > switch (mode) > { > + case E_HFmode: > case E_SFmode: > case E_DFmode: > case E_XFmode: > diff --git a/gcc/config/i386/i386-isa.def b/gcc/config/i386/i386-isa.def > index a0d46cbc892..83d9302ea3d 100644 > --- a/gcc/config/i386/i386-isa.def > +++ b/gcc/config/i386/i386-isa.def > @@ -108,3 +108,4 @@ DEF_PTA(HRESET) > DEF_PTA(KL) > DEF_PTA(WIDEKL) > DEF_PTA(AVXVNNI) > +DEF_PTA(AVX512FP16) > diff --git a/gcc/config/i386/i386-modes.def b/gcc/config/i386/i386-modes.def > index 4e7014be034..9232f59a925 100644 > --- a/gcc/config/i386/i386-modes.def > +++ b/gcc/config/i386/i386-modes.def > @@ -23,6 +23,7 @@ along with GCC; see the file COPYING3. If not see > > FRACTIONAL_FLOAT_MODE (XF, 80, 12, ieee_extended_intel_96_format); > FLOAT_MODE (TF, 16, ieee_quad_format); > +FLOAT_MODE (HF, 2, ieee_half_format); > > /* In ILP32 mode, XFmode has size 12 and alignment 4. > In LP64 mode, XFmode has size and alignment 16. 
*/ > diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c > index 0eccb549c22..b7b6f68af56 100644 > --- a/gcc/config/i386/i386-options.c > +++ b/gcc/config/i386/i386-options.c > @@ -223,7 +223,8 @@ static struct ix86_target_opts isa2_opts[] = > { "-mhreset", OPTION_MASK_ISA2_HRESET }, > { "-mkl", OPTION_MASK_ISA2_KL }, > { "-mwidekl", OPTION_MASK_ISA2_WIDEKL }, > - { "-mavxvnni", OPTION_MASK_ISA2_AVXVNNI } > + { "-mavxvnni", OPTION_MASK_ISA2_AVXVNNI }, > + { "-mavx512fp16", OPTION_MASK_ISA2_AVX512FP16 } > }; > static struct ix86_target_opts isa_opts[] = > { > @@ -1045,6 +1046,7 @@ ix86_valid_target_attribute_inner_p (tree fndecl, tree > args, char *p_strings[], > IX86_ATTR_ISA ("amx-bf16", OPT_mamx_bf16), > IX86_ATTR_ISA ("hreset", OPT_mhreset), > IX86_ATTR_ISA ("avxvnni", OPT_mavxvnni), > + IX86_ATTR_ISA ("avx512fp16", OPT_mavx512fp16), > > /* enum options */ > IX86_ATTR_ENUM ("fpmath=", OPT_mfpmath_), > @@ -2495,6 +2497,12 @@ ix86_option_override_internal (bool main_args_p, > else > opts->x_ix86_fpmath = TARGET_FPMATH_DEFAULT_P (opts->x_ix86_isa_flags); > > + if (TARGET_AVX512FP16 && (opts->x_ix86_fpmath & FPMATH_SSE) == 0) > + { > + opts->x_ix86_fpmath = (fpmath_unit) (opts->x_ix86_fpmath > + | FPMATH_SSE); > + } > + > /* Use external vectorized library in vectorizing intrinsics. */ > if (opts_set->x_ix86_veclibabi_type) > switch (opts->x_ix86_veclibabi_type) > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index a93128fa0a4..9ca31e934ab 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -604,6 +604,10 @@ ix86_can_inline_p (tree caller, tree callee) > ret = false; > > else if (caller_opts->x_ix86_fpmath != callee_opts->x_ix86_fpmath > + /* AVX512FP16 will always enable ssemath since there's > + no x87 instructions for HFmode. > + This is for -m32 -mavx512fp16 when fpmath=x87 default. */ > + && ! TARGET_AVX512FP16 > /* If the calle doesn't use FP expressions differences in > ix86_fpmath can be ignored. 
We are called from FEs > for multi-versioning call optimization, so beware of > @@ -2350,6 +2354,7 @@ classify_argument (machine_mode mode, const_tree type, > gcc_unreachable (); > case E_CTImode: > return 0; > + case E_HFmode: > case E_SFmode: > if (!(bit_offset % 64)) > classes[0] = X86_64_SSESF_CLASS; > @@ -2367,6 +2372,7 @@ classify_argument (machine_mode mode, const_tree type, > classes[0] = X86_64_SSE_CLASS; > classes[1] = X86_64_SSEUP_CLASS; > return 2; > + case E_HCmode: > case E_SCmode: > classes[0] = X86_64_SSE_CLASS; > if (!(bit_offset % 64)) > @@ -2578,9 +2584,9 @@ construct_container (machine_mode mode, machine_mode > orig_mode, > return NULL; > } > > - /* First construct simple cases. Avoid SCmode, since we want to use > - single register to pass this type. */ > - if (n == 1 && mode != SCmode) > + /* First construct simple cases. Avoid HCmode and SCmode, since we > + want to use single register to pass these types. */ > + if (n == 1 && mode != HCmode && mode != SCmode) > switch (regclass[0]) > { > case X86_64_INTEGER_CLASS: > @@ -3896,6 +3902,10 @@ function_value_32 (machine_mode orig_mode, > machine_mode mode, > else if (VECTOR_MODE_P (mode) && GET_MODE_SIZE (mode) == 64) > regno = FIRST_SSE_REG; > > + /* _Float16 return values in %xmm0. */ > + else if (mode == HFmode || mode == HCmode) > + regno = FIRST_SSE_REG; > + > /* Floating point return values in %st(0) (unless -mno-fp-ret-in-387). */ > else if (X87_FLOAT_MODE_P (mode) && TARGET_FLOAT_RETURNS_IN_80387) > regno = FIRST_FLOAT_REG; > @@ -3939,6 +3949,8 @@ function_value_64 (machine_mode orig_mode, machine_mode > mode, > > switch (mode) > { > + case E_HFmode: > + case E_HCmode: > case E_SFmode: > case E_SCmode: > case E_DFmode: > @@ -5303,7 +5315,12 @@ ix86_get_ssemov (rtx *operands, unsigned size, > switch (type) > { > case opcode_int: > - opcode = misaligned_p ? "vmovdqu32" : "vmovdqa32"; > + if (scalar_mode == E_HFmode) > + opcode = (misaligned_p > + ? (TARGET_AVX512BW ? 
"vmovdqu16" : "vmovdqu64") > + : "vmovdqa64"); > + else > + opcode = misaligned_p ? "vmovdqu32" : "vmovdqa32"; > break; > case opcode_float: > opcode = misaligned_p ? "vmovups" : "vmovaps"; > @@ -5317,6 +5334,11 @@ ix86_get_ssemov (rtx *operands, unsigned size, > { > switch (scalar_mode) > { > + case E_HFmode: > + opcode = (misaligned_p > + ? (TARGET_AVX512BW ? "vmovdqu16" : "vmovdqu64") > + : "vmovdqa64"); > + break; > case E_SFmode: > opcode = misaligned_p ? "%vmovups" : "%vmovaps"; > break; > @@ -5452,6 +5474,9 @@ ix86_output_ssemov (rtx_insn *insn, rtx *operands) > case MODE_SI: > return "%vmovd\t{%1, %0|%0, %1}"; > > + case MODE_HI: > + return "vmovw\t{%1, %0|%0, %1}"; > + > case MODE_DF: > if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1])) > return "vmovsd\t{%d1, %0|%0, %d1}"; > @@ -5464,6 +5489,12 @@ ix86_output_ssemov (rtx_insn *insn, rtx *operands) > else > return "%vmovss\t{%1, %0|%0, %1}"; > > + case MODE_HF: > + if (REG_P (operands[0]) && REG_P (operands[1])) > + return "vmovsh\t{%d1, %0|%0, %d1}"; > + else > + return "vmovsh\t{%1, %0|%0, %1}"; > + > case MODE_V1DF: > gcc_assert (!TARGET_AVX); > return "movlpd\t{%1, %0|%0, %1}"; > @@ -13411,6 +13442,15 @@ ix86_print_operand (FILE *file, rtx x, int code) > (file, addr, MEM_ADDR_SPACE (x), code == 'p' || code == 'P'); > } > > + else if (CONST_DOUBLE_P (x) && GET_MODE (x) == HFmode) > + { > + long l = real_to_target (NULL, CONST_DOUBLE_REAL_VALUE (x), > + REAL_MODE_FORMAT (HFmode)); > + if (ASSEMBLER_DIALECT == ASM_ATT) > + putc ('$', file); > + fprintf (file, "0x%04x", (unsigned int) l); > + } > + > else if (CONST_DOUBLE_P (x) && GET_MODE (x) == SFmode) > { > long l; > @@ -13901,7 +13941,9 @@ output_387_binary_op (rtx_insn *insn, rtx *operands) > > if (is_sse) > { > - p = (GET_MODE (operands[0]) == SFmode) ? "ss" : "sd"; > + p = (GET_MODE (operands[0]) == HFmode > + ? "sh" > + : (GET_MODE (operands[0]) == SFmode ? 
"ss" : "sd")); > strcat (buf, p); > > if (TARGET_AVX) > @@ -19157,21 +19199,26 @@ ix86_can_change_mode_class (machine_mode from, > machine_mode to, > static inline int > sse_store_index (machine_mode mode) > { > - switch (GET_MODE_SIZE (mode)) > - { > - case 4: > - return 0; > - case 8: > - return 1; > - case 16: > - return 2; > - case 32: > - return 3; > - case 64: > - return 4; > - default: > - return -1; > - } > + /* NB: Use SFmode cost for HFmode instead of adding HFmode load/store > + costs to processor_costs, which requires changes to all entries in > + processor cost table. */ > + if (mode == E_HFmode) > + mode = E_SFmode; > + switch (GET_MODE_SIZE (mode)) > + { > + case 4: > + return 0; > + case 8: > + return 1; > + case 16: > + return 2; > + case 32: > + return 3; > + case 64: > + return 4; > + default: > + return -1; > + } > } > > /* Return the cost of moving data of mode M between a > @@ -19198,6 +19245,7 @@ inline_memory_move_cost (machine_mode mode, enum > reg_class regclass, int in) > int index; > switch (mode) > { > + case E_HFmode: > case E_SFmode: > index = 0; > break; > @@ -19298,11 +19346,31 @@ inline_memory_move_cost (machine_mode mode, enum > reg_class regclass, int in) > } > break; > case 2: > - if (in == 2) > - return MAX (ix86_cost->hard_register.int_load[1], > - ix86_cost->hard_register.int_store[1]); > - return in ? ix86_cost->hard_register.int_load[1] > - : ix86_cost->hard_register.int_store[1]; > + { > + int cost; > + if (in == 2) > + cost = MAX (ix86_cost->hard_register.int_load[1], > + ix86_cost->hard_register.int_store[1]); > + else > + cost = in ? ix86_cost->hard_register.int_load[1] > + : ix86_cost->hard_register.int_store[1]; > + if (mode == E_HFmode) > + { > + /* Prefer SSE over GPR for HFmode. */ > + int sse_cost; > + int index = sse_store_index (mode); > + if (in == 2) > + sse_cost = MAX (ix86_cost->hard_register.sse_load[index], > + ix86_cost->hard_register.sse_store[index]); > + else > + sse_cost = (in > + ? 
ix86_cost->hard_register.sse_load [index] > + : ix86_cost->hard_register.sse_store [index]); > + if (sse_cost >= cost) > + cost = sse_cost + 1; > + } > + return cost; > + } > default: > if (in == 2) > cost = MAX (ix86_cost->hard_register.int_load[2], > @@ -19476,6 +19544,8 @@ ix86_hard_regno_mode_ok (unsigned int regno, > machine_mode mode) > - XI mode > - any of 512-bit wide vector mode > - any scalar mode. */ > + /* For AVX512FP16, vmovw supports movement of HImode > + between gpr and sse register. */ > if (TARGET_AVX512F > && (mode == XImode > || VALID_AVX512F_REG_MODE (mode) > @@ -19539,6 +19609,8 @@ ix86_hard_regno_mode_ok (unsigned int regno, > machine_mode mode) > return true; > else if (VALID_FP_MODE_P (mode)) > return true; > + else if ((mode == HFmode || mode == HCmode) && TARGET_AVX512FP16) > + return true; > else if (VALID_DFP_MODE_P (mode)) > return true; > /* Lots of MMX code casts 8 byte vector modes to DImode. If we then go > @@ -19720,7 +19792,8 @@ ix86_set_reg_reg_cost (machine_mode mode) > > case MODE_VECTOR_INT: > case MODE_VECTOR_FLOAT: > - if ((TARGET_AVX512F && VALID_AVX512F_REG_MODE (mode)) > + if ((TARGET_AVX512FP16 && VALID_AVX512FP16_REG_MODE (mode)) > + || (TARGET_AVX512F && VALID_AVX512F_REG_MODE (mode)) > || (TARGET_AVX && VALID_AVX256_REG_MODE (mode)) > || (TARGET_SSE2 && VALID_SSE2_REG_MODE (mode)) > || (TARGET_SSE && VALID_SSE_REG_MODE (mode)) > @@ -21550,10 +21623,31 @@ ix86_scalar_mode_supported_p (scalar_mode mode) > return default_decimal_float_supported_p (); > else if (mode == TFmode) > return true; > + else if (mode == HFmode) > + /* NB: Always return TRUE for HFmode so that the _Float16 type will > + be defined by the C front-end for AVX512FP16 intrinsics. We will > + issue an error in ix86_expand_move for HFmode if AVX512FP16 isn't > + enabled. 
*/ > + return true; > else > return default_scalar_mode_supported_p (mode); > } > > +/* Implement TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P - return TRUE > + if MODE is HFmode, and punt to the generic implementation otherwise. */ > + > +static bool > +ix86_libgcc_floating_mode_supported_p (scalar_float_mode mode) > +{ > + /* NB: Always return TRUE for HFmode so that the _Float16 type will > + be defined by the C front-end for AVX512FP16 intrinsics. We will > + issue an error in ix86_expand_move for HFmode if AVX512FP16 isn't > + enabled. */ > + return (mode == HFmode > + ? true > + : default_libgcc_floating_mode_supported_p (mode)); > +} > + > /* Implements target hook vector_mode_supported_p. */ > static bool > ix86_vector_mode_supported_p (machine_mode mode) > @@ -21842,6 +21936,10 @@ ix86_mangle_type (const_tree type) > > switch (TYPE_MODE (type)) > { > + case E_HFmode: > + /* _Float16 is "DF16_". > + Align with clang's decision in https://reviews.llvm.org/D33719. */ > + return "DF16_"; > case E_TFmode: > /* __float128 is "g". */ > return "g"; > @@ -23218,6 +23316,8 @@ ix86_get_excess_precision (enum excess_precision_type > type) > switch (type) > { > case EXCESS_PRECISION_TYPE_FAST: > + if (TARGET_AVX512FP16) > + return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16; > /* The fastest type to promote to will always be the native type, > whether that occurs with implicit excess precision or > otherwise. */ > @@ -23230,6 +23330,8 @@ ix86_get_excess_precision (enum excess_precision_type > type) > cases. 
*/ > if (!TARGET_80387) > return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT; > + else if (TARGET_AVX512FP16) > + return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16; > else if (!TARGET_MIX_SSE_I387) > { > if (!(TARGET_SSE && TARGET_SSE_MATH)) > @@ -23795,6 +23897,10 @@ ix86_run_selftests (void) > #undef TARGET_SCALAR_MODE_SUPPORTED_P > #define TARGET_SCALAR_MODE_SUPPORTED_P ix86_scalar_mode_supported_p > > +#undef TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P > +#define TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P \ > + ix86_libgcc_floating_mode_supported_p > + > #undef TARGET_VECTOR_MODE_SUPPORTED_P > #define TARGET_VECTOR_MODE_SUPPORTED_P ix86_vector_mode_supported_p > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h > index 6e0340a4b60..1e4733420a1 100644 > --- a/gcc/config/i386/i386.h > +++ b/gcc/config/i386/i386.h > @@ -990,7 +990,8 @@ extern const char *host_detect_local_cpu (int argc, const > char **argv); > > #define VALID_AVX512F_SCALAR_MODE(MODE) > \ > ((MODE) == DImode || (MODE) == DFmode || (MODE) == SImode \ > - || (MODE) == SFmode) > + || (MODE) == SFmode \ > + || (((MODE) == HImode || (MODE) == HFmode) && TARGET_AVX512FP16)) > > #define VALID_AVX512F_REG_MODE(MODE) \ > ((MODE) == V8DImode || (MODE) == V8DFmode || (MODE) == V64QImode \ > @@ -1005,6 +1006,9 @@ extern const char *host_detect_local_cpu (int argc, > const char **argv); > || (MODE) == V4SImode || (MODE) == V4SFmode || (MODE) == V8HImode \ > || (MODE) == TFmode || (MODE) == V1TImode) > > +#define VALID_AVX512FP16_REG_MODE(MODE) > \ > + ((MODE) == V8HFmode || (MODE) == V16HFmode || (MODE) == V32HFmode) > + > #define VALID_SSE2_REG_MODE(MODE) \ > ((MODE) == V16QImode || (MODE) == V8HImode || (MODE) == V2DFmode \ > || (MODE) == V4QImode || (MODE) == V2HImode \ > @@ -1032,7 +1036,7 @@ extern const char *host_detect_local_cpu (int argc, > const char **argv); > > #define VALID_FP_MODE_P(MODE) \ > ((MODE) == SFmode || (MODE) == DFmode || (MODE) == XFmode \ > - || (MODE) == SCmode || (MODE) == DCmode || (MODE) == 
XCmode) > \ > + || (MODE) == SCmode || (MODE) == DCmode || (MODE) == XCmode) > > #define VALID_INT_MODE_P(MODE) \ > ((MODE) == QImode || (MODE) == HImode > \ > @@ -1055,13 +1059,17 @@ extern const char *host_detect_local_cpu (int argc, > const char **argv); > || (MODE) == V4DImode || (MODE) == V8SFmode || (MODE) == V4DFmode \ > || (MODE) == V2TImode || (MODE) == V8DImode || (MODE) == V64QImode \ > || (MODE) == V16SImode || (MODE) == V32HImode || (MODE) == V8DFmode \ > - || (MODE) == V16SFmode) > + || (MODE) == V16SFmode \ > + || (((MODE) == V8HFmode || (MODE) == V16HFmode || (MODE) == V32HFmode) \ > + && TARGET_AVX512FP16)) > > #define X87_FLOAT_MODE_P(MODE) \ > (TARGET_80387 && ((MODE) == SFmode || (MODE) == DFmode || (MODE) == > XFmode)) > > #define SSE_FLOAT_MODE_P(MODE) \ > - ((TARGET_SSE && (MODE) == SFmode) || (TARGET_SSE2 && (MODE) == DFmode)) > + ((TARGET_AVX512FP16 && (MODE) == HFmode) \ > + || (TARGET_SSE && (MODE) == SFmode) \ > + || (TARGET_SSE2 && (MODE) == DFmode)) > > #define FMA4_VEC_FLOAT_MODE_P(MODE) \ > (TARGET_FMA4 && ((MODE) == V4SFmode || (MODE) == V2DFmode \ > @@ -2256,7 +2264,7 @@ constexpr wide_int_bitmask PTA_TIGERLAKE = > PTA_ICELAKE_CLIENT | PTA_MOVDIRI > constexpr wide_int_bitmask PTA_SAPPHIRERAPIDS = PTA_COOPERLAKE | PTA_MOVDIRI > | PTA_MOVDIR64B | PTA_AVX512VP2INTERSECT | PTA_ENQCMD | PTA_CLDEMOTE > | PTA_PTWRITE | PTA_WAITPKG | PTA_SERIALIZE | PTA_TSXLDTRK | PTA_AMX_TILE > - | PTA_AMX_INT8 | PTA_AMX_BF16 | PTA_UINTR | PTA_AVXVNNI; > + | PTA_AMX_INT8 | PTA_AMX_BF16 | PTA_UINTR | PTA_AVXVNNI | PTA_AVX512FP16; > constexpr wide_int_bitmask PTA_KNL = PTA_BROADWELL | PTA_AVX512PF > | PTA_AVX512ER | PTA_AVX512F | PTA_AVX512CD | PTA_PREFETCHWT1; > constexpr wide_int_bitmask PTA_BONNELL = PTA_CORE2 | PTA_MOVBE; > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md > index 9b619e2f78f..ee5660e8161 100644 > --- a/gcc/config/i386/i386.md > +++ b/gcc/config/i386/i386.md > @@ -496,7 +496,7 @@ (define_attr "type" > > ;; Main data type 
used by the insn > (define_attr "mode" > - "unknown,none,QI,HI,SI,DI,TI,OI,XI,SF,DF,XF,TF,V16SF,V8SF,V4DF,V4SF, > + "unknown,none,QI,HI,SI,DI,TI,OI,XI,HF,SF,DF,XF,TF,V16SF,V8SF,V4DF,V4SF, > V2DF,V2SF,V1DF,V8DF" > (const_string "unknown")) > > @@ -832,8 +832,7 @@ (define_attr "isa" > "base,x64,nox64,x64_sse2,x64_sse4,x64_sse4_noavx, > sse_noavx,sse2,sse2_noavx,sse3,sse3_noavx,sse4,sse4_noavx, > avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,noavx512f, > avx512bw,noavx512bw,avx512dq,noavx512dq, > - avx512vl,noavx512vl, > - avxvnni,avx512vnnivl" > + avx512vl,noavx512vl,avxvnni,avx512vnnivl,avx512fp16" > (const_string "base")) > > ;; Define instruction set of MMX instructions > @@ -885,7 +884,8 @@ (define_attr "enabled" "" > (eq_attr "isa" "avxvnni") (symbol_ref "TARGET_AVXVNNI") > (eq_attr "isa" "avx512vnnivl") > (symbol_ref "TARGET_AVX512VNNI && TARGET_AVX512VL") > - > + (eq_attr "isa" "avx512fp16") > + (symbol_ref "TARGET_AVX512FP16") > (eq_attr "mmx_isa" "native") > (symbol_ref "!TARGET_MMX_WITH_SSE") > (eq_attr "mmx_isa" "sse") > @@ -1089,8 +1089,9 @@ (define_mode_iterator SWI48DWI [SI DI (TI > "TARGET_64BIT")]) > ;; compile time constant, it is faster to use <MODE_SIZE> than > ;; GET_MODE_SIZE (<MODE>mode). For XFmode which depends on > ;; command line options just use GET_MODE_SIZE macro. 
> -(define_mode_attr MODE_SIZE [(QI "1") (HI "2") (SI "4") (DI "8") (TI "16") > - (SF "4") (DF "8") (XF "GET_MODE_SIZE (XFmode)") > +(define_mode_attr MODE_SIZE [(QI "1") (HI "2") (SI "4") (DI "8") > + (TI "16") (HF "2") (SF "4") (DF "8") > + (XF "GET_MODE_SIZE (XFmode)") > (V16QI "16") (V32QI "32") (V64QI "64") > (V8HI "16") (V16HI "32") (V32HI "64") > (V4SI "16") (V8SI "32") (V16SI "64") > @@ -1222,13 +1223,22 @@ (define_mode_iterator MODEF [SF DF]) > ;; All x87 floating point modes > (define_mode_iterator X87MODEF [SF DF XF]) > > +;; SSE and x87 SFmode floating point mode and HFmode > +(define_mode_iterator MODESH [(HF "TARGET_AVX512FP16") SF]) > + > +;; SSE and x87 SFmode and DFmode floating point modes plus HFmode > +(define_mode_iterator MODEFH [(HF "TARGET_AVX512FP16") SF DF]) > + > +;; All x87 floating point modes plus HFmode > +(define_mode_iterator X87MODEFH [HF SF DF XF]) > + > ;; All SSE floating point modes > (define_mode_iterator SSEMODEF [SF DF TF]) > (define_mode_attr ssevecmodef [(SF "V4SF") (DF "V2DF") (TF "TF")]) > > ;; SSE instruction suffix for various modes > (define_mode_attr ssemodesuffix > - [(SF "ss") (DF "sd") > + [(HF "sh") (SF "ss") (DF "sd") > (V16SF "ps") (V8DF "pd") > (V8SF "ps") (V4DF "pd") > (V4SF "ps") (V2DF "pd") > @@ -1495,8 +1505,8 @@ (define_expand "cstorexf4" > > (define_expand "cbranch<mode>4" > [(set (reg:CC FLAGS_REG) > - (compare:CC (match_operand:MODEF 1 "cmp_fp_expander_operand") > - (match_operand:MODEF 2 "cmp_fp_expander_operand"))) > + (compare:CC (match_operand:MODEFH 1 "cmp_fp_expander_operand") > + (match_operand:MODEFH 2 "cmp_fp_expander_operand"))) > (set (pc) (if_then_else > (match_operator 0 "ix86_fp_comparison_operator" > [(reg:CC FLAGS_REG) > @@ -1702,6 +1712,17 @@ (define_insn "*cmpi<unord><MODEF:mode>" > (eq_attr "alternative" "0") > (symbol_ref "true") > (symbol_ref "false"))))]) > + > +(define_insn "*cmpi<unord>hf" > + [(set (reg:CCFP FLAGS_REG) > + (compare:CCFP > + (match_operand:HF 0 
"register_operand" "v") > + (match_operand:HF 1 "register_ssemem_operand" "vm")))] > + "TARGET_AVX512FP16" > + "v<unord>comish\t{%1, %0|%0, %1}" > + [(set_attr "type" "ssecomi") > + (set_attr "prefix" "evex") > + (set_attr "mode" "HF")]) > > ;; Push/pop instructions. > > @@ -2433,8 +2454,8 @@ (define_insn "*movsi_internal" > (symbol_ref "true")))]) > > (define_insn "*movhi_internal" > - [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r ,r ,m ,*k,*k > ,*r,*m,*k") > - (match_operand:HI 1 "general_operand" "r > ,rn,rm,rn,*r,*km,*k,*k,CBC"))] > + [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r ,r ,m ,*k,*k > ,*r,*m,*k,?r,?v,*v,*v,*m") > + (match_operand:HI 1 "general_operand" "r > ,rn,rm,rn,*r,*km,*k,*k,CBC,v, r, v, m, v"))] > "!(MEM_P (operands[0]) && MEM_P (operands[1])) > && ix86_hardreg_mov_ok (operands[0], operands[1])" > > @@ -2460,6 +2481,9 @@ (define_insn "*movhi_internal" > gcc_unreachable (); > } > > + case TYPE_SSEMOV: > + return ix86_output_ssemov (insn, operands); > + > case TYPE_MSKLOG: > if (operands[1] == const0_rtx) > return "kxorw\t%0, %0, %0"; > @@ -2475,7 +2499,9 @@ (define_insn "*movhi_internal" > } > } > [(set (attr "type") > - (cond [(eq_attr "alternative" "4,5,6,7") > + (cond [(eq_attr "alternative" "9,10,11,12,13") > + (const_string "ssemov") > + (eq_attr "alternative" "4,5,6,7") > (const_string "mskmov") > (eq_attr "alternative" "8") > (const_string "msklog") > @@ -2500,6 +2526,8 @@ (define_insn "*movhi_internal" > (set (attr "mode") > (cond [(eq_attr "type" "imovx") > (const_string "SI") > + (eq_attr "alternative" "11") > + (const_string "HF") > (and (eq_attr "alternative" "1,2") > (match_operand:HI 1 "aligned_operand")) > (const_string "SI") > @@ -2508,7 +2536,12 @@ (define_insn "*movhi_internal" > (not (match_test "TARGET_HIMODE_MATH")))) > (const_string "SI") > ] > - (const_string "HI")))]) > + (const_string "HI"))) > + (set (attr "isa") > + (cond [(eq_attr "alternative" "9,10,11,12,13") > + (const_string "avx512fp16") > + ] > 
+ (const_string "*")))]) > > ;; Situation is quite tricky about when to choose full sized (SImode) move > ;; over QImode moves. For Q_REG -> Q_REG move we use full size only for > @@ -3158,10 +3191,34 @@ (define_insn "*pushsf" > (set_attr "unit" "i387,*,*") > (set_attr "mode" "SF,SI,SF")]) > > +(define_insn "*pushhf_rex64" > + [(set (match_operand:HF 0 "push_operand" "=X,X") > + (match_operand:HF 1 "nonmemory_no_elim_operand" "r,x"))] > + "TARGET_64BIT && TARGET_AVX512FP16" > +{ > + /* Anything else should be already split before reg-stack. */ > + gcc_assert (which_alternative == 0); > + return "push{q}\t%q1"; > +} > + [(set_attr "type" "push,multi") > + (set_attr "mode" "DI,HF")]) > + > +(define_insn "*pushhf" > + [(set (match_operand:HF 0 "push_operand" "=X,X") > + (match_operand:HF 1 "general_no_elim_operand" "rmF,x"))] > + "!TARGET_64BIT && TARGET_AVX512FP16" > +{ > + /* Anything else should be already split before reg-stack. */ > + gcc_assert (which_alternative == 0); > + return "push{l}\t%k1"; > +} > + [(set_attr "type" "push,multi") > + (set_attr "mode" "SI,HF")]) > + > ;; %%% Kill this when call knows how to work this out. 
> (define_split > - [(set (match_operand:SF 0 "push_operand") > - (match_operand:SF 1 "any_fp_register_operand"))] > + [(set (match_operand:MODESH 0 "push_operand") > + (match_operand:MODESH 1 "any_fp_register_operand"))] > "reload_completed" > [(set (reg:P SP_REG) (plus:P (reg:P SP_REG) (match_dup 2))) > (set (match_dup 0) (match_dup 1))] > @@ -3209,8 +3266,8 @@ (define_expand "movtf" > "ix86_expand_move (TFmode, operands); DONE;") > > (define_expand "mov<mode>" > - [(set (match_operand:X87MODEF 0 "nonimmediate_operand") > - (match_operand:X87MODEF 1 "general_operand"))] > + [(set (match_operand:X87MODEFH 0 "nonimmediate_operand") > + (match_operand:X87MODEFH 1 "general_operand"))] > "" > "ix86_expand_move (<MODE>mode, operands); DONE;") > > @@ -3646,6 +3703,56 @@ (define_insn "*movsf_internal" > ] > (const_string "*")))]) > > +(define_insn "*movhf_internal" > + [(set (match_operand:HF 0 "nonimmediate_operand" > + "=?r,?m,v,v,v,m,?r,?v,r ,m") > + (match_operand:HF 1 "general_operand" > + "rmF,rF,C,v,m,v,v ,r ,rmF,rF"))] > + "TARGET_AVX512FP16 > + && !(MEM_P (operands[0]) && MEM_P (operands[1])) > + && (lra_in_progress > + || reload_completed > + || !CONST_DOUBLE_P (operands[1]) > + || standard_sse_constant_p (operands[1], HFmode) == 1 > + || memory_operand (operands[0], HFmode))" > +{ > + switch (get_attr_type (insn)) > + { > + case TYPE_IMOV: > + return "mov{w}\t{%1, %0|%0, %1}"; > + > + case TYPE_SSELOG1: > + return standard_sse_constant_opcode (insn, operands); > + > + case TYPE_SSEMOV: > + return ix86_output_ssemov (insn, operands); > + > + default: > + gcc_unreachable (); > + } > +} > + [(set (attr "type") > + (cond [(eq_attr "alternative" "0,1,8,9") > + (const_string "imov") > + (eq_attr "alternative" "2") > + (const_string "sselog1") > + ] > + (const_string "ssemov"))) > + (set (attr "prefix") > + (cond [(eq_attr "alternative" "0,1,8,9") > + (const_string "orig") > + (eq_attr "alternative" "2") > + (const_string "maybe_evex") > + ] > + (const_string 
"evex"))) > + (set (attr "mode") > + (cond [(eq_attr "alternative" "0,1,6,7,8,9") > + (const_string "HI") > + (eq_attr "alternative" "2") > + (const_string "V4SF") > + ] > + (const_string "HF")))]) > + > (define_split > [(set (match_operand 0 "any_fp_register_operand") > (match_operand 1 "memory_operand"))] > @@ -4383,6 +4490,17 @@ (define_split > emit_move_insn (operands[0], CONST0_RTX (V2DFmode)); > }) > > +(define_insn "extendhf<mode>2" > + [(set (match_operand:MODEF 0 "nonimm_ssenomem_operand" "=v") > + (float_extend:MODEF > + (match_operand:HF 1 "nonimmediate_operand" "vm")))] > + "TARGET_AVX512FP16" > + "vcvtsh2<ssemodesuffix>\t{%1, %0, %0|%0, %0, %1}" > + [(set_attr "type" "ssecvt") > + (set_attr "prefix" "evex") > + (set_attr "mode" "<MODE>")]) > + > + > (define_expand "extend<mode>xf2" > [(set (match_operand:XF 0 "nonimmediate_operand") > (float_extend:XF (match_operand:MODEF 1 "general_operand")))] > @@ -4560,6 +4678,18 @@ (define_insn "truncxf<mode>2" > (symbol_ref "flag_unsafe_math_optimizations") > ] > (symbol_ref "true")))]) > + > +;; Conversion from {SF,DF}mode to HFmode. > + > +(define_insn "trunc<mode>hf2" > + [(set (match_operand:HF 0 "register_operand" "=v") > + (float_truncate:HF > + (match_operand:MODEF 1 "register_ssemem_operand" "vm")))] > + "TARGET_AVX512FP16" > + "vcvt<ssemodesuffix>2sh\t{%1, %d0|%d0, %1}" > + [(set_attr "type" "ssecvt") > + (set_attr "prefix" "evex") > + (set_attr "mode" "HF")]) > > ;; Signed conversion to DImode. 
> > @@ -4936,6 +5066,16 @@ (define_insn "*float<SWI48:mode><MODEF:mode>2" > (symbol_ref "TARGET_INTER_UNIT_CONVERSIONS")] > (symbol_ref "true")))]) > > +(define_insn "float<floatunssuffix><mode>hf2" > + [(set (match_operand:HF 0 "register_operand" "=v") > + (any_float:HF > + (match_operand:SWI48 1 "nonimmediate_operand" "rm")))] > + "TARGET_AVX512FP16" > + "vcvt<floatsuffix>si2sh<rex64suffix>\t{%1, %d0|%d0, %1}" > + [(set_attr "type" "sseicvt") > + (set_attr "prefix" "evex") > + (set_attr "mode" "HF")]) > + > (define_insn "*floatdi<MODEF:mode>2_i387" > [(set (match_operand:MODEF 0 "register_operand" "=f") > (float:MODEF (match_operand:DI 1 "nonimmediate_operand" "m")))] > @@ -7517,10 +7657,10 @@ (define_expand "<insn>xf3" > "TARGET_80387") > > (define_expand "<insn><mode>3" > - [(set (match_operand:MODEF 0 "register_operand") > - (plusminus:MODEF > - (match_operand:MODEF 1 "register_operand") > - (match_operand:MODEF 2 "nonimmediate_operand")))] > + [(set (match_operand:MODEFH 0 "register_operand") > + (plusminus:MODEFH > + (match_operand:MODEFH 1 "register_operand") > + (match_operand:MODEFH 2 "nonimmediate_operand")))] > "(TARGET_80387 && X87_ENABLE_ARITH (<MODE>mode)) > || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)") > > @@ -8094,9 +8234,9 @@ (define_expand "mulxf3" > "TARGET_80387") > > (define_expand "mul<mode>3" > - [(set (match_operand:MODEF 0 "register_operand") > - (mult:MODEF (match_operand:MODEF 1 "register_operand") > - (match_operand:MODEF 2 "nonimmediate_operand")))] > + [(set (match_operand:MODEFH 0 "register_operand") > + (mult:MODEFH (match_operand:MODEFH 1 "register_operand") > + (match_operand:MODEFH 2 "nonimmediate_operand")))] > "(TARGET_80387 && X87_ENABLE_ARITH (<MODE>mode)) > || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)") > > @@ -8111,9 +8251,9 @@ (define_expand "divxf3" > "TARGET_80387") > > (define_expand "div<mode>3" > - [(set (match_operand:MODEF 0 "register_operand") > - (div:MODEF (match_operand:MODEF 1 
"register_operand") > - (match_operand:MODEF 2 "nonimmediate_operand")))] > + [(set (match_operand:MODEFH 0 "register_operand") > + (div:MODEFH (match_operand:MODEFH 1 "register_operand") > + (match_operand:MODEFH 2 "nonimmediate_operand")))] > "(TARGET_80387 && X87_ENABLE_ARITH (<MODE>mode)) > || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)" > { > @@ -16105,6 +16245,22 @@ (define_insn "*fop_<mode>_comm" > (symbol_ref "true") > (symbol_ref "false"))))]) > > +(define_insn "*fop_hf_comm" > + [(set (match_operand:HF 0 "register_operand" "=v") > + (match_operator:HF 3 "binary_fp_operator" > + [(match_operand:HF 1 "nonimmediate_operand" "%v") > + (match_operand:HF 2 "nonimmediate_operand" "vm")]))] > + "TARGET_AVX512FP16 > + && COMMUTATIVE_ARITH_P (operands[3]) > + && !(MEM_P (operands[1]) && MEM_P (operands[2]))" > + "* return output_387_binary_op (insn, operands);" > + [(set (attr "type") > + (if_then_else (match_operand:HF 3 "mult_operator") > + (const_string "ssemul") > + (const_string "sseadd"))) > + (set_attr "prefix" "evex") > + (set_attr "mode" "HF")]) > + > (define_insn "*rcpsf2_sse" > [(set (match_operand:SF 0 "register_operand" "=x,x,x") > (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "0,x,m")] > @@ -16178,6 +16334,22 @@ (define_insn "*fop_<mode>_1" > (symbol_ref "true") > (symbol_ref "false"))))]) > > +(define_insn "*fop_hf_1" > + [(set (match_operand:HF 0 "register_operand" "=v") > + (match_operator:HF 3 "binary_fp_operator" > + [(match_operand:HF 1 "nonimmediate_operand" "v") > + (match_operand:HF 2 "nonimmediate_operand" "vm")]))] > + "TARGET_AVX512FP16 > + && !COMMUTATIVE_ARITH_P (operands[3]) > + && !(MEM_P (operands[1]) && MEM_P (operands[2]))" > + "* return output_387_binary_op (insn, operands);" > + [(set (attr "type") > + (if_then_else (match_operand:MODEF 3 "div_operator") > + (const_string "ssediv") > + (const_string "sseadd"))) > + (set_attr "prefix" "evex") > + (set_attr "mode" "<MODE>")]) > + > (define_insn 
"*fop_<X87MODEF:mode>_2_i387" > [(set (match_operand:X87MODEF 0 "register_operand" "=f") > (match_operator:X87MODEF 3 "binary_fp_operator" > @@ -18972,11 +19144,11 @@ (define_peephole2 > }) > > (define_expand "mov<mode>cc" > - [(set (match_operand:X87MODEF 0 "register_operand") > - (if_then_else:X87MODEF > + [(set (match_operand:X87MODEFH 0 "register_operand") > + (if_then_else:X87MODEFH > (match_operand 1 "comparison_operator") > - (match_operand:X87MODEF 2 "register_operand") > - (match_operand:X87MODEF 3 "register_operand")))] > + (match_operand:X87MODEFH 2 "register_operand") > + (match_operand:X87MODEFH 3 "register_operand")))] > "(TARGET_80387 && TARGET_CMOVE) > || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)" > "if (ix86_expand_fp_movcc (operands)) DONE; else FAIL;") > @@ -19140,10 +19312,10 @@ (define_insn "<code><mode>3" > ;; presence of -0.0 and NaN. > > (define_insn "*ieee_s<ieee_maxmin><mode>3" > - [(set (match_operand:MODEF 0 "register_operand" "=x,v") > - (unspec:MODEF > - [(match_operand:MODEF 1 "register_operand" "0,v") > - (match_operand:MODEF 2 "nonimmediate_operand" "xm,vm")] > + [(set (match_operand:MODEFH 0 "register_operand" "=x,v") > + (unspec:MODEFH > + [(match_operand:MODEFH 1 "register_operand" "0,v") > + (match_operand:MODEFH 2 "nonimmediate_operand" "xm,vm")] > IEEE_MAXMIN))] > "SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH" > "@ > diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt > index 7b8547bb1c3..ad366974b5b 100644 > --- a/gcc/config/i386/i386.opt > +++ b/gcc/config/i386/i386.opt > @@ -1166,3 +1166,7 @@ Emit GNU_PROPERTY_X86_ISA_1_NEEDED GNU property. > mmwait > Target Mask(ISA2_MWAIT) Var(ix86_isa_flags2) Save > Support MWAIT and MONITOR built-in functions and code generation. > + > +mavx512fp16 > +Target Mask(ISA2_AVX512FP16) Var(ix86_isa_flags2) Save > +Support MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX512F and > AVX512FP16 built-in functions and code generation. 
> diff --git a/gcc/config/i386/immintrin.h b/gcc/config/i386/immintrin.h > index f129de4bbe5..5344e22c9c8 100644 > --- a/gcc/config/i386/immintrin.h > +++ b/gcc/config/i386/immintrin.h > @@ -94,6 +94,8 @@ > > #include <avx512vp2intersectvlintrin.h> > > +#include <avx512fp16intrin.h> > + > #include <shaintrin.h> > > #include <fmaintrin.h> > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md > index ffcc0c81964..446f9ba552f 100644 > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -321,6 +321,11 @@ (define_mode_iterator VF2_512_256VL > (define_mode_iterator VF_128 > [V4SF (V2DF "TARGET_SSE2")]) > > +;; All 128bit vector HF/SF/DF modes > +(define_mode_iterator VFH_128 > + [(V8HF "TARGET_AVX512FP16") > + V4SF (V2DF "TARGET_SSE2")]) > + > ;; All 256bit vector float modes > (define_mode_iterator VF_256 > [V8SF V4DF]) > @@ -730,8 +735,10 @@ (define_mode_attr avx512bcst > > ;; Mapping from float mode to required SSE level > (define_mode_attr sse > - [(SF "sse") (DF "sse2") > + [(SF "sse") (DF "sse2") (HF "avx512fp16") > (V4SF "sse") (V2DF "sse2") > + (V32HF "avx512fp16") (V16HF "avx512fp16") > + (V8HF "avx512fp16") > (V16SF "avx512f") (V8SF "avx") > (V8DF "avx512f") (V4DF "avx")]) > > @@ -869,6 +876,7 @@ (define_mode_attr ssescalarmode > (V32HI "HI") (V16HI "HI") (V8HI "HI") > (V16SI "SI") (V8SI "SI") (V4SI "SI") > (V8DI "DI") (V4DI "DI") (V2DI "DI") > + (V32HF "HF") (V16HF "HF") (V8HF "HF") > (V16SF "SF") (V8SF "SF") (V4SF "SF") > (V8DF "DF") (V4DF "DF") (V2DF "DF") > (V4TI "TI") (V2TI "TI")]) > @@ -948,10 +956,10 @@ (define_mode_attr sseintprefix > > ;; SSE scalar suffix for vector modes > (define_mode_attr ssescalarmodesuffix > - [(SF "ss") (DF "sd") > - (V16SF "ss") (V8DF "sd") > - (V8SF "ss") (V4DF "sd") > - (V4SF "ss") (V2DF "sd") > + [(HF "sh") (SF "ss") (DF "sd") > + (V32HF "sh") (V16SF "ss") (V8DF "sd") > + (V16HF "sh") (V8SF "ss") (V4DF "sd") > + (V8HF "sh") (V4SF "ss") (V2DF "sd") > (V16SI "d") (V8DI "q") > (V8SI "d") (V4DI "q") > (V4SI 
"d") (V2DI "q")]) > @@ -1903,12 +1911,12 @@ (define_insn "*<insn><mode>3<mask_name><round_name>" > ;; Standard scalar operation patterns which preserve the rest of the > ;; vector for combiner. > (define_insn "*<sse>_vm<insn><mode>3" > - [(set (match_operand:VF_128 0 "register_operand" "=x,v") > - (vec_merge:VF_128 > - (vec_duplicate:VF_128 > + [(set (match_operand:VFH_128 0 "register_operand" "=x,v") > + (vec_merge:VFH_128 > + (vec_duplicate:VFH_128 > (plusminus:<ssescalarmode> > (vec_select:<ssescalarmode> > - (match_operand:VF_128 1 "register_operand" "0,v") > + (match_operand:VFH_128 1 "register_operand" "0,v") > (parallel [(const_int 0)])) > (match_operand:<ssescalarmode> 2 "nonimmediate_operand" > "xm,vm"))) > (match_dup 1) > @@ -1966,12 +1974,12 @@ (define_insn "*mul<mode>3<mask_name><round_name>" > ;; Standard scalar operation patterns which preserve the rest of the > ;; vector for combiner. > (define_insn "*<sse>_vm<multdiv_mnemonic><mode>3" > - [(set (match_operand:VF_128 0 "register_operand" "=x,v") > - (vec_merge:VF_128 > - (vec_duplicate:VF_128 > + [(set (match_operand:VFH_128 0 "register_operand" "=x,v") > + (vec_merge:VFH_128 > + (vec_duplicate:VFH_128 > (multdiv:<ssescalarmode> > (vec_select:<ssescalarmode> > - (match_operand:VF_128 1 "register_operand" "0,v") > + (match_operand:VFH_128 1 "register_operand" "0,v") > (parallel [(const_int 0)])) > (match_operand:<ssescalarmode> 2 "nonimmediate_operand" > "xm,vm"))) > (match_dup 1) > @@ -2368,12 +2376,12 @@ (define_insn > "ieee_<ieee_maxmin><mode>3<mask_name><round_saeonly_name>" > ;; Standard scalar operation patterns which preserve the rest of the > ;; vector for combiner. 
> (define_insn "*ieee_<ieee_maxmin><mode>3" > - [(set (match_operand:VF_128 0 "register_operand" "=x,v") > - (vec_merge:VF_128 > - (vec_duplicate:VF_128 > + [(set (match_operand:VFH_128 0 "register_operand" "=x,v") > + (vec_merge:VFH_128 > + (vec_duplicate:VFH_128 > (unspec:<ssescalarmode> > [(vec_select:<ssescalarmode> > - (match_operand:VF_128 1 "register_operand" "0,v") > + (match_operand:VFH_128 1 "register_operand" "0,v") > (parallel [(const_int 0)])) > (match_operand:<ssescalarmode> 2 "nonimmediate_operand" > "xm,vm")] > IEEE_MAXMIN)) > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index 2dc6a2106d9..3e1b1dbd606 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -1392,6 +1392,7 @@ See RS/6000 and PowerPC Options. > -mavx5124fmaps -mavx512vnni -mavx5124vnniw -mprfchw -mrdpid @gol > -mrdseed -msgx -mavx512vp2intersect -mserialize -mtsxldtrk@gol > -mamx-tile -mamx-int8 -mamx-bf16 -muintr -mhreset -mavxvnni@gol > +-mavx512fp16 @gol > -mcldemote -mms-bitfields -mno-align-stringops -minline-all-stringops @gol > -minline-stringops-dynamically -mstringop-strategy=@var{alg} @gol > -mkl -mwidekl @gol > @@ -31059,6 +31060,9 @@ preferred alignment to > @option{-mpreferred-stack-boundary=2}. > @itemx -mavx512bf16 > @opindex mavx512bf16 > @need 200 > +@itemx -mavx512fp16 > +@opindex mavx512fp16 > +@need 200 > @itemx -mgfni > @opindex mgfni > @need 200 > @@ -31137,9 +31141,9 @@ WBNOINVD, FMA4, PREFETCHW, RDPID, PREFETCHWT1, > RDSEED, SGX, XOP, LWP, > XSAVEOPT, XSAVEC, XSAVES, RTM, HLE, TBM, MWAITX, CLZERO, PKU, AVX512VBMI2, > GFNI, VAES, WAITPKG, VPCLMULQDQ, AVX512BITALG, MOVDIRI, MOVDIR64B, > AVX512BF16, > ENQCMD, AVX512VPOPCNTDQ, AVX5124FMAPS, AVX512VNNI, AVX5124VNNIW, SERIALIZE, > -UINTR, HRESET, AMXTILE, AMXINT8, AMXBF16, KL, WIDEKL, AVXVNNI or CLDEMOTE > -extended instruction sets. Each has a corresponding @option{-mno-} option to > -disable use of these instructions. 
> +UINTR, HRESET, AMXTILE, AMXINT8, AMXBF16, KL, WIDEKL, AVXVNNI, AVX512FP16 > +or CLDEMOTE extended instruction sets. Each has a corresponding > +@option{-mno-} option to disable use of these instructions. > > These extensions are also available as built-in functions: see > @ref{x86 Built-in Functions}, for details of the functions enabled and > diff --git a/gcc/testsuite/g++.target/i386/float16-1.C > b/gcc/testsuite/g++.target/i386/float16-1.C > new file mode 100644 > index 00000000000..8f07e85d184 > --- /dev/null > +++ b/gcc/testsuite/g++.target/i386/float16-1.C > @@ -0,0 +1,8 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mno-avx512fp16" } */ > + > +_Float16 > +foo (_Float16 x) /* { dg-error "is not supported on this > target\[\n\r]*compilation terminated" } */ > +{ > + return x; > +} > diff --git a/gcc/testsuite/g++.target/i386/float16-2.C > b/gcc/testsuite/g++.target/i386/float16-2.C > new file mode 100644 > index 00000000000..99eb797eff1 > --- /dev/null > +++ b/gcc/testsuite/g++.target/i386/float16-2.C > @@ -0,0 +1,14 @@ > +/* { dg-do assemble { target avx512fp16 } } */ > +/* { dg-options "-O2 -mavx512fp16" } */ > + > +union flt > +{ > + _Float16 flt; > + short s; > +}; > + > +_Float16 > +foo (union flt x) > +{ > + return x.flt; > +} > diff --git a/gcc/testsuite/g++.target/i386/float16-3.C > b/gcc/testsuite/g++.target/i386/float16-3.C > new file mode 100644 > index 00000000000..940878503f1 > --- /dev/null > +++ b/gcc/testsuite/g++.target/i386/float16-3.C > @@ -0,0 +1,10 @@ > +/* { dg-do assemble { target avx512fp16 } } */ > +/* { dg-options "-O0 -mavx512fp16" } */ > + > +template <typename> void a(char *) {} > +char b, d; > +void c() > +{ > + a<unsigned char>(&d); > + a<_Float16>(&b); > +} > diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c > b/gcc/testsuite/gcc.target/i386/avx-1.c > index 6178e38ce02..f3676077743 100644 > --- a/gcc/testsuite/gcc.target/i386/avx-1.c > +++ b/gcc/testsuite/gcc.target/i386/avx-1.c > @@ -1,5 +1,5 @@ > /* { dg-do compile 
} */ > -/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -m3dnow > -mavx -mavx2 -maes -mpclmul -mgfni -mavx512bw" } */ > +/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -m3dnow > -mavx -mavx2 -maes -mpclmul -mgfni -mavx512bw -mavx512fp16" } */ > /* { dg-add-options bind_pic_locally } */ > > #include <mm_malloc.h> > diff --git a/gcc/testsuite/gcc.target/i386/avx-2.c > b/gcc/testsuite/gcc.target/i386/avx-2.c > index 986fbd819e4..1751c52565c 100644 > --- a/gcc/testsuite/gcc.target/i386/avx-2.c > +++ b/gcc/testsuite/gcc.target/i386/avx-2.c > @@ -1,5 +1,5 @@ > /* { dg-do compile } */ > -/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -m3dnow > -mavx -mavx2 -msse4a -maes -mpclmul -mavx512bw" } */ > +/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -m3dnow > -mavx -mavx2 -msse4a -maes -mpclmul -mavx512bw -mavx512fp16" } */ > /* { dg-add-options bind_pic_locally } */ > > #include <mm_malloc.h> > diff --git a/gcc/testsuite/gcc.target/i386/avx512-check.h > b/gcc/testsuite/gcc.target/i386/avx512-check.h > index 0a377dba1d5..0ad9064f637 100644 > --- a/gcc/testsuite/gcc.target/i386/avx512-check.h > +++ b/gcc/testsuite/gcc.target/i386/avx512-check.h > @@ -87,6 +87,9 @@ main () > #ifdef AVX512VNNI > && (ecx & bit_AVX512VNNI) > #endif > +#ifdef AVX512FP16 > + && (edx & bit_AVX512FP16) > +#endif > #ifdef VAES > && (ecx & bit_VAES) > #endif > diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-12a.c > b/gcc/testsuite/gcc.target/i386/avx512fp16-12a.c > new file mode 100644 > index 00000000000..88887556d68 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/avx512fp16-12a.c > @@ -0,0 +1,21 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -mavx512fp16" } */ > + > +_Float16 > +__attribute__ ((noinline, noclone)) > +do_max (_Float16 __A, _Float16 __B) > +{ > + return __A > __B ? 
__A : __B;
> +}
> +
> +_Float16
> +__attribute__ ((noinline, noclone))
> +do_min (_Float16 __A, _Float16 __B)
> +{
> +  return __A < __B ? __A : __B;
> +}
> +
> +/* { dg-final { scan-assembler-times "vmaxsh\[ \\t\]" 1 } } */
> +/* { dg-final { scan-assembler-times "vminsh\[ \\t\]" 1 } } */
> +/* { dg-final { scan-assembler-not "vmovsh\[ \\t\]" { target { ! ia32 } } }
> } */
> +/* { dg-final { scan-assembler-not "vcomish\[ \\t\]" } } */
> diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-12b.c
> b/gcc/testsuite/gcc.target/i386/avx512fp16-12b.c
> new file mode 100644
> index 00000000000..c9e23bf95c2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx512fp16-12b.c
> @@ -0,0 +1,27 @@
> +/* { dg-do run { target avx512fp16 } } */
> +/* { dg-options "-O2 -mavx512fp16" } */
> +
> +#include <string.h>
> +
> +static void do_test (void);
> +
> +#define DO_TEST do_test
> +#define AVX512FP16
> +#include "avx512-check.h"
> +#include "avx512fp16-12a.c"
> +
> +static void
> +do_test (void)
> +{
> +  _Float16 x = 0.1f;
> +  _Float16 y = -3.2f;
> +  _Float16 z;
> +
> +  z = do_max (x, y);
> +  if (z != x)
> +    abort ();
> +
> +  z = do_min (x, y);
> +  if (z != y)
> +    abort ();
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/float16-1.c
> b/gcc/testsuite/gcc.target/i386/float16-1.c
> new file mode 100644
> index 00000000000..8f07e85d184
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/float16-1.c
> @@ -0,0 +1,8 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mno-avx512fp16" } */
> +
> +_Float16
> +foo (_Float16 x) /* { dg-error "is not supported on this
> target\[\n\r]*compilation terminated" } */
> +{
> +  return x;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/float16-2.c
> b/gcc/testsuite/gcc.target/i386/float16-2.c
> new file mode 100644
> index 00000000000..99eb797eff1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/float16-2.c
> @@ -0,0 +1,14 @@
> +/* { dg-do assemble { target avx512fp16 } } */
> +/* { dg-options "-O2 -mavx512fp16" } */
> +
> +union flt
> +{
> +  _Float16 flt;
> +  short s;
> +};
> +
> +_Float16
> +foo (union flt x)
> +{
> +  return x.flt;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/float16-3a.c
> b/gcc/testsuite/gcc.target/i386/float16-3a.c
> new file mode 100644
> index 00000000000..3846c8e9b6e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/float16-3a.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx512fp16" } */
> +
> +_Float16
> +foo (int x)
> +{
> +  return x;
> +}
> +
> +/* { dg-final { scan-assembler-times "vcvtsi2shl\[ \t\]+\[^\n\r]*%xmm0" 1 }
> } */
> diff --git a/gcc/testsuite/gcc.target/i386/float16-3b.c
> b/gcc/testsuite/gcc.target/i386/float16-3b.c
> new file mode 100644
> index 00000000000..247dd6e7e33
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/float16-3b.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx512fp16" } */
> +
> +_Float16
> +foo (unsigned int x)
> +{
> +  return x;
> +}
> +
> +/* { dg-final { scan-assembler-times "vcvtusi2shl\[ \t\]+\[^\n\r]*%xmm0" 1 }
> } */
> diff --git a/gcc/testsuite/gcc.target/i386/float16-4a.c
> b/gcc/testsuite/gcc.target/i386/float16-4a.c
> new file mode 100644
> index 00000000000..631082581f3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/float16-4a.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-O2 -mavx512fp16" } */
> +
> +_Float16
> +foo (long long x)
> +{
> +  return x;
> +}
> +
> +/* { dg-final { scan-assembler-times "vcvtsi2shq\[ \t\]+\[^\n\r]*%xmm0" 1 }
> } */
> diff --git a/gcc/testsuite/gcc.target/i386/float16-4b.c
> b/gcc/testsuite/gcc.target/i386/float16-4b.c
> new file mode 100644
> index 00000000000..828d8530769
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/float16-4b.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-O2 -mavx512fp16" } */
> +
> +_Float16
> +foo (unsigned long long x)
> +{
> +  return x;
> +}
> +
> +/* { dg-final { scan-assembler-times "vcvtusi2shq\[ \t\]+\[^\n\r]*%xmm0" 1 }
> } */
> diff --git a/gcc/testsuite/gcc.target/i386/funcspec-56.inc
> b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
> index 79265c7c94f..8499fdf2db9 100644
> --- a/gcc/testsuite/gcc.target/i386/funcspec-56.inc
> +++ b/gcc/testsuite/gcc.target/i386/funcspec-56.inc
> @@ -79,6 +79,7 @@ extern void test_hreset (void)
> __attribute__((__target__("hreset")));
>  extern void test_keylocker (void)
> __attribute__((__target__("kl")));
>  extern void test_widekl (void)
> __attribute__((__target__("widekl")));
>  extern void test_avxvnni (void)
> __attribute__((__target__("avxvnni")));
> +extern void test_avx512fp16 (void)
> __attribute__((__target__("avx512fp16")));
>
>  extern void test_no_sgx (void)
> __attribute__((__target__("no-sgx")));
>  extern void test_no_avx5124fmaps(void)
> __attribute__((__target__("no-avx5124fmaps")));
> @@ -159,6 +160,7 @@ extern void test_no_hreset (void)
> __attribute__((__target__("no-hreset")));
>  extern void test_no_keylocker (void)
> __attribute__((__target__("no-kl")));
>  extern void test_no_widekl (void)
> __attribute__((__target__("no-widekl")));
>  extern void test_no_avxvnni (void)
> __attribute__((__target__("no-avxvnni")));
> +extern void test_no_avx512fp16 (void)
> __attribute__((__target__("no-avx512fp16")));
>
>  extern void test_arch_nocona (void)
> __attribute__((__target__("arch=nocona")));
>  extern void test_arch_core2 (void)
> __attribute__((__target__("arch=core2")));
> diff --git a/gcc/testsuite/gcc.target/i386/pr54855-12.c
> b/gcc/testsuite/gcc.target/i386/pr54855-12.c
> new file mode 100644
> index 00000000000..87b4f459a5a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr54855-12.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx512fp16" } */
> +/* { dg-final { scan-assembler-times "vmaxsh\[ \\t\]" 1 } } */
> +/* { dg-final { scan-assembler-not "vcomish\[ \\t\]" } } */
> +/* { dg-final { scan-assembler-not "vmovsh\[ \\t\]" { target { ! ia32 } } }
> } */
> +
> +#include <immintrin.h>
> +
> +__m128h
> +foo (__m128h x, __m128h y)
> +{
> +  x[0] = x[0] > y[0] ? x[0] : y[0];
> +  return x;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c
> b/gcc/testsuite/gcc.target/i386/sse-13.c
> index 7029771334b..f5f5c113612 100644
> --- a/gcc/testsuite/gcc.target/i386/sse-13.c
> +++ b/gcc/testsuite/gcc.target/i386/sse-13.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a
> -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi
> -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw
> -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha
> -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512vl -mavx512dq -mavx512bw
> -mavx512vbmi -mavx512vbmi2 -mavx512ifma -mavx5124fmaps -mavx5124vnniw
> -mavx512vpopcntdq -mavx512vp2intersect -mclwb -mmwaitx -mclzero -mpku -msgx
> -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd
> -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl
> -mavxvnni" } */
> +/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse4a
> -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi
> -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw
> -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha
> -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512vl -mavx512dq -mavx512bw
> -mavx512vbmi -mavx512vbmi2 -mavx512ifma -mavx5124fmaps -mavx5124vnniw
> -mavx512vpopcntdq -mavx512vp2intersect -mclwb -mmwaitx -mclzero -mpku -msgx
> -mrdpid -mgfni -mavx512bitalg -mpconfig -mwbnoinvd -mavx512bf16 -menqcmd
> -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl
> -mavxvnni -mavx512fp16" } */
>  /* { dg-add-options bind_pic_locally } */
>
>  #include <mm_malloc.h>
> diff --git a/gcc/testsuite/gcc.target/i386/sse-14.c
> b/gcc/testsuite/gcc.target/i386/sse-14.c
> index 4ce0ffffaf3..747d504cedb 100644
> --- a/gcc/testsuite/gcc.target/i386/sse-14.c
> +++ b/gcc/testsuite/gcc.target/i386/sse-14.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -msse4a
> -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi
> -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw
> -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha
> -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl
> -mavx512ifma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw
> -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni
> -mpconfig -mwbnoinvd -mavx512vl -mavx512bf16 -menqcmd -mavx512vp2intersect
> -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl
> -mavxvnni" } */
> +/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -msse4a
> -m3dnow -mavx -mavx2 -mfma4 -mxop -maes -mpclmul -mpopcnt -mabm -mlzcnt -mbmi
> -mbmi2 -mtbm -mlwp -mfsgsbase -mrdrnd -mf16c -mfma -mrtm -mrdseed -mprfchw
> -madx -mfxsr -mxsaveopt -mavx512f -mavx512er -mavx512cd -mavx512pf -msha
> -mprefetchwt1 -mxsavec -mxsaves -mclflushopt -mavx512dq -mavx512bw -mavx512vl
> -mavx512ifma -mavx512vbmi -mavx512vbmi2 -mavx5124fmaps -mavx5124vnniw
> -mavx512vpopcntdq -mclwb -mmwaitx -mclzero -mpku -msgx -mrdpid -mgfni
> -mpconfig -mwbnoinvd -mavx512vl -mavx512bf16 -menqcmd -mavx512vp2intersect
> -mserialize -mtsxldtrk -mamx-tile -mamx-int8 -mamx-bf16 -mkl -mwidekl
> -mavxvnni -mavx512fp16" } */
>  /* { dg-add-options bind_pic_locally } */
>
>  #include <mm_malloc.h>
> diff --git a/gcc/testsuite/gcc.target/i386/sse-22.c
> b/gcc/testsuite/gcc.target/i386/sse-22.c
> index 6e8b6f3fa1b..33411969901 100644
> --- a/gcc/testsuite/gcc.target/i386/sse-22.c
> +++ b/gcc/testsuite/gcc.target/i386/sse-22.c
> @@ -103,7 +103,7 @@
>
>
>  #ifndef DIFFERENT_PRAGMAS
> -#pragma GCC target
> ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,avx512vl,avx512bw,avx512dq,avx512vbmi,avx512vbmi2,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,gfni,avx512bitalg,avx512bf16,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni")
> +#pragma GCC target
> ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,avx512vl,avx512bw,avx512dq,avx512vbmi,avx512vbmi2,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,gfni,avx512bitalg,avx512bf16,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni,avx512fp16")
>  #endif
>
>  /* Following intrinsics require immediate arguments. They
> @@ -220,7 +220,7 @@ test_4 (_mm_cmpestrz, int, __m128i, int, __m128i, int, 1)
>
>  /* immintrin.h (AVX/AVX2/RDRND/FSGSBASE/F16C/RTM/AVX512F/SHA) */
>  #ifdef DIFFERENT_PRAGMAS
> -#pragma GCC target
> ("avx,avx2,rdrnd,fsgsbase,f16c,rtm,avx512f,avx512er,avx512cd,avx512pf,sha,avx512vl,avx512bw,avx512dq,avx512ifma,avx512vbmi,avx512vbmi2,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,gfni,avx512bitalg,avx512bf16,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni")
> +#pragma GCC target
> ("avx,avx2,rdrnd,fsgsbase,f16c,rtm,avx512f,avx512er,avx512cd,avx512pf,sha,avx512vl,avx512bw,avx512dq,avx512ifma,avx512vbmi,avx512vbmi2,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,gfni,avx512bitalg,avx512bf16,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni,avx512fp16")
>  #endif
>  #include <immintrin.h>
>  test_1 (_cvtss_sh, unsigned short, float, 1)
> diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c
> b/gcc/testsuite/gcc.target/i386/sse-23.c
> index 7faa053ace8..86590ca5ffb 100644
> --- a/gcc/testsuite/gcc.target/i386/sse-23.c
> +++ b/gcc/testsuite/gcc.target/i386/sse-23.c
> @@ -708,6 +708,6 @@
>  #define __builtin_ia32_vpclmulqdq_v2di(A, B, C)
> __builtin_ia32_vpclmulqdq_v2di(A, B, 1)
>  #define __builtin_ia32_vpclmulqdq_v8di(A, B, C)
> __builtin_ia32_vpclmulqdq_v8di(A, B, 1)
>
> -#pragma GCC target
> ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,clflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,clwb,mwaitx,clzero,pku,sgx,rdpid,gfni,avx512vbmi2,vpclmulqdq,avx512bitalg,pconfig,wbnoinvd,avx512bf16,enqcmd,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni")
> +#pragma GCC target
> ("sse4a,3dnow,avx,avx2,fma4,xop,aes,pclmul,popcnt,abm,lzcnt,bmi,bmi2,tbm,lwp,fsgsbase,rdrnd,f16c,fma,rtm,rdseed,prfchw,adx,fxsr,xsaveopt,avx512f,avx512er,avx512cd,avx512pf,sha,prefetchwt1,xsavec,xsaves,clflushopt,avx512bw,avx512dq,avx512vl,avx512vbmi,avx512ifma,avx5124fmaps,avx5124vnniw,avx512vpopcntdq,clwb,mwaitx,clzero,pku,sgx,rdpid,gfni,avx512vbmi2,vpclmulqdq,avx512bitalg,pconfig,wbnoinvd,avx512bf16,enqcmd,avx512vp2intersect,serialize,tsxldtrk,amx-tile,amx-int8,amx-bf16,kl,widekl,avxvnni,avx512fp16")
>
>  #include <x86intrin.h>
> diff --git a/gcc/testsuite/lib/target-supports.exp
> b/gcc/testsuite/lib/target-supports.exp
> index 7f78c5593ac..3a7f19ca8a7 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -3020,7 +3020,7 @@ proc check_effective_target_has_q_floating_suffix { } {
>
>  proc check_effective_target_float16 {} {
>      return [check_no_compiler_messages_nocache float16 object {
> -        _Float16 x;
> +        _Float16 foo (_Float16 x) { return x; }
>      } [add_options_for_float16 ""]]
>  }
>
> @@ -8654,6 +8654,17 @@ proc check_prefer_avx128 { } {
>  }
>
>
> +# Return 1 if avx512fp16 instructions can be compiled.
> +
> +proc check_effective_target_avx512fp16 { } {
> +    return [check_no_compiler_messages avx512fp16 object {
> +        void foo (void)
> +        {
> +            asm volatile ("vmovw %di, %xmm0");
> +        }
> +    } "-O2 -mavx512fp16" ]
> +}
> +
>  # Return 1 if avx512f instructions can be compiled.
>
>  proc check_effective_target_avx512f { } {
> --
> 2.18.1
>

--
BR,
Hongtao