https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89229
H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #45707|0                           |1
        is obsolete|                            |

--- Comment #24 from H.J. Lu <hjl.tools at gmail dot com> ---
Comment on attachment 45707
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45707
A new patch

>From fd7220a7551ee774614ca89574241813aae153b7 Mon Sep 17 00:00:00 2001
>From: "H.J. Lu" <hjl.to...@gmail.com>
>Date: Tue, 12 Feb 2019 13:25:41 -0800
>Subject: [PATCH] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move
>
>i386 backend has
>
>INT_MODE (OI, 32);
>INT_MODE (XI, 64);
>
>So, XI_MODE represents 64 INTEGER bytes = 64 * 8 = 512 bit operation,
>in case of const_1, all 512 bits set.
>
>We can load zeros with narrower instruction, (e.g. 256 bit by inherent
>zeroing of highpart in case of 128 bit xor), so TImode in this case.
>
>Some targets prefer V4SF mode, so they will emit float xorps for zeroing.
>
>sse.md has
>
>(define_insn "mov<mode>_internal"
>  [(set (match_operand:VMOVE 0 "nonimmediate_operand"
>         "=v,v ,v ,m")
>        (match_operand:VMOVE 1 "nonimmediate_or_sse_const_operand"
>         " C,BC,vm,v"))]
>....
>      /* There is no evex-encoded vmov* for sizes smaller than 64-bytes
>         in avx512f, so we need to use workarounds, to access sse registers
>         16-31, which are evex-only. In avx512vl we don't need workarounds. */
>      if (TARGET_AVX512F && <MODE_SIZE> < 64 && !TARGET_AVX512VL
>          && (EXT_REX_SSE_REG_P (operands[0])
>              || EXT_REX_SSE_REG_P (operands[1])))
>        {
>          if (memory_operand (operands[0], <MODE>mode))
>            {
>              if (<MODE_SIZE> == 32)
>                return "vextract<shuffletype>64x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
>              else if (<MODE_SIZE> == 16)
>                return "vextract<shuffletype>32x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
>              else
>                gcc_unreachable ();
>            }
>...
>
>However, since ix86_hard_regno_mode_ok has
>
>      /* TODO check for QI/HI scalars. */
>      /* AVX512VL allows sse regs16+ for 128/256 bit modes. */
>      if (TARGET_AVX512VL
>          && (mode == OImode
>              || mode == TImode
>              || VALID_AVX256_REG_MODE (mode)
>              || VALID_AVX512VL_128_REG_MODE (mode)))
>        return true;
>
>      /* xmm16-xmm31 are only available for AVX-512. */
>      if (EXT_REX_SSE_REGNO_P (regno))
>        return false;
>
>      if (TARGET_AVX512F && <MODE_SIZE> < 64 && !TARGET_AVX512VL
>          && (EXT_REX_SSE_REG_P (operands[0])
>              || EXT_REX_SSE_REG_P (operands[1])))
>
>is dead code.
>
>All TYPE_SSEMOV vector moves are consolidated to ix86_output_ssemov:
>
>1. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE/AVX vector
>moves will be generated.
>2. If xmm16-xmm31/ymm16-ymm31 registers are used:
>   a. With AVX512VL, AVX512VL vector moves will be generated.
>   b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register to register
>      move will be done with zmm register move.
>
>ext_sse_reg_operand is removed since it is no longer needed.
>
>gcc/
>
>	PR target/89229
>	* config/i386/i386-protos.h (ix86_output_ssemov): New prototype.
>	* config/i386/i386.c (ix86_get_ssemov): New function.
>	(ix86_output_ssemov): Likewise.
>	* config/i386/i386.md (*movxi_internal_avx512f): Call
>	ix86_output_ssemov for TYPE_SSEMOV.
>	(*movoi_internal_avx): Call ix86_output_ssemov for TYPE_SSEMOV.
>	Remove ext_sse_reg_operand and TARGET_AVX512VL check.
>	(*movti_internal): Likewise.
>	(*movdi_internal): Call ix86_output_ssemov for TYPE_SSEMOV.
>	Remove ext_sse_reg_operand check.
>	(*movsi_internal): Likewise.
>	(*movtf_internal): Call ix86_output_ssemov for TYPE_SSEMOV.
>	(*movdf_internal): Call ix86_output_ssemov for TYPE_SSEMOV.
>	Remove TARGET_AVX512F, TARGET_PREFER_AVX256, TARGET_AVX512VL
>	and ext_sse_reg_operand check.
>	(*movsf_internal_avx): Call ix86_output_ssemov for TYPE_SSEMOV.
>	Remove TARGET_PREFER_AVX256, TARGET_AVX512VL and
>	ext_sse_reg_operand check.
>	* config/i386/mmx.md (MMXMODE:*mov<mode>_internal): Call
>	ix86_output_ssemov for TYPE_SSEMOV.  Remove ext_sse_reg_operand
>	check.
>	* config/i386/sse.md (VMOVE:mov<mode>_internal): Call
>	ix86_output_ssemov for TYPE_SSEMOV.  Remove TARGET_AVX512VL
>	check.
>	* config/i386/predicates.md (ext_sse_reg_operand): Removed.
>
>gcc/testsuite/
>
>	PR target/89229
>	* gcc.target/i386/pr89229-2a.c: New test.
>	* gcc.target/i386/pr89229-2b.c: Likewise.
>	* gcc.target/i386/pr89229-2c.c: Likewise.
>	* gcc.target/i386/pr89229-3a.c: Likewise.
>	* gcc.target/i386/pr89229-3b.c: Likewise.
>	* gcc.target/i386/pr89229-3c.c: Likewise.
>	* gcc.target/i386/pr89229-4a.c: Likewise.
>	* gcc.target/i386/pr89229-4b.c: Likewise.
>	* gcc.target/i386/pr89229-4c.c: Likewise.
>	* gcc.target/i386/pr89229-5a.c: Likewise.
>	* gcc.target/i386/pr89229-5b.c: Likewise.
>	* gcc.target/i386/pr89229-5c.c: Likewise.
>	* gcc.target/i386/pr89229-6a.c: Likewise.
>	* gcc.target/i386/pr89229-6b.c: Likewise.
>	* gcc.target/i386/pr89229-6c.c: Likewise.
>	* gcc.target/i386/pr89229-7a.c: Likewise.
>	* gcc.target/i386/pr89229-7b.c: Likewise.
>	* gcc.target/i386/pr89229-7c.c: Likewise.
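The three cases listed in the commit message (1, 2a, 2b) can be sketched as a small standalone decision function. This is only an illustration under stated assumptions, not code from the patch: `choose_ssemov`, `struct ssemov_choice` and the `SSEMOV_*` names are hypothetical, the boolean parameters stand in for `EXT_REX_SSE_REG_P`, `TARGET_AVX512VL` and `memory_operand`, and the real `ix86_get_ssemov` additionally picks the mnemonic from the scalar mode and alignment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of a 16/32/64-byte vector move.
   None of these names exist in GCC; they only mirror the commit message.  */
enum ssemov_kind
{
  SSEMOV_SSE_AVX,   /* Case 1: no xmm16+/ymm16+ involved, SSE/AVX move.  */
  SSEMOV_EVEX,      /* Case 2a (AVX512VL), or any 64-byte move.  */
  SSEMOV_EVEX_ZMM,  /* Case 2b: register-to-register move done as zmm move.  */
  SSEMOV_INVALID    /* xmm16+/ymm16+ with a memory operand but no AVX512VL;
                       the patch reaches gcc_unreachable () here.  */
};

struct ssemov_choice
{
  enum ssemov_kind kind;
  unsigned size;    /* Operand size in bytes that the emitted move uses.  */
};

/* EVEX_REG_P: either operand is xmm16-xmm31/ymm16-ymm31 (assumption:
   stands in for EXT_REX_SSE_REG_P on both operands).  */
static struct ssemov_choice
choose_ssemov (unsigned size, bool evex_reg_p, bool have_avx512vl,
               bool has_mem_operand)
{
  assert (size == 16 || size == 32 || size == 64);
  struct ssemov_choice c = { SSEMOV_SSE_AVX, size };

  if (size == 64)
    c.kind = SSEMOV_EVEX;          /* 512-bit moves are always EVEX.  */
  else if (!evex_reg_p)
    c.kind = SSEMOV_SSE_AVX;       /* Case 1.  */
  else if (have_avx512vl)
    c.kind = SSEMOV_EVEX;          /* Case 2a.  */
  else if (!has_mem_operand)
    {
      c.kind = SSEMOV_EVEX_ZMM;    /* Case 2b: widen to the full zmm.  */
      c.size = 64;
    }
  else
    c.kind = SSEMOV_INVALID;
  return c;
}
```

Under these assumptions, a 16-byte register-to-register move involving xmm16 without AVX512VL is the only path that changes the operand size, which matches what the pr89229-3b.c and pr89229-5b.c tests scan for (a zmm move).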
>---
> gcc/config/i386/i386-protos.h              |   2 +
> gcc/config/i386/i386.c                     | 250 +++++++++++++++++++++
> gcc/config/i386/i386.md                    | 212 ++---------------
> gcc/config/i386/mmx.md                     |  29 +--
> gcc/config/i386/predicates.md              |   5 -
> gcc/config/i386/sse.md                     |  98 +-------
> gcc/testsuite/gcc.target/i386/pr89229-2a.c |  15 ++
> gcc/testsuite/gcc.target/i386/pr89229-2b.c |  13 ++
> gcc/testsuite/gcc.target/i386/pr89229-2c.c |   6 +
> gcc/testsuite/gcc.target/i386/pr89229-3a.c |  17 ++
> gcc/testsuite/gcc.target/i386/pr89229-3b.c |   6 +
> gcc/testsuite/gcc.target/i386/pr89229-3c.c |   7 +
> gcc/testsuite/gcc.target/i386/pr89229-4a.c |  17 ++
> gcc/testsuite/gcc.target/i386/pr89229-4b.c |   6 +
> gcc/testsuite/gcc.target/i386/pr89229-4c.c |   7 +
> gcc/testsuite/gcc.target/i386/pr89229-5a.c |  16 ++
> gcc/testsuite/gcc.target/i386/pr89229-5b.c |   6 +
> gcc/testsuite/gcc.target/i386/pr89229-5c.c |   6 +
> gcc/testsuite/gcc.target/i386/pr89229-6a.c |  16 ++
> gcc/testsuite/gcc.target/i386/pr89229-6b.c |   6 +
> gcc/testsuite/gcc.target/i386/pr89229-6c.c |   6 +
> gcc/testsuite/gcc.target/i386/pr89229-7a.c |  16 ++
> gcc/testsuite/gcc.target/i386/pr89229-7b.c |  12 +
> gcc/testsuite/gcc.target/i386/pr89229-7c.c |   6 +
> 24 files changed, 453 insertions(+), 327 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2a.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2b.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2c.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3a.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3b.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3c.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4a.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4b.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4c.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5a.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5b.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5c.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6a.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6b.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6c.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7a.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7b.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7c.c
>
>diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
>index 2d600173917..27f5cc13abf 100644
>--- a/gcc/config/i386/i386-protos.h
>+++ b/gcc/config/i386/i386-protos.h
>@@ -38,6 +38,8 @@ extern void ix86_expand_split_stack_prologue (void);
> extern void ix86_output_addr_vec_elt (FILE *, int);
> extern void ix86_output_addr_diff_elt (FILE *, int, int);
>
>+extern const char *ix86_output_ssemov (rtx_insn *, rtx *);
>+
> extern enum calling_abi ix86_cfun_abi (void);
> extern enum calling_abi ix86_function_type_abi (const_tree);
>
>diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>index fd05873ba39..97d1ea4229e 100644
>--- a/gcc/config/i386/i386.c
>+++ b/gcc/config/i386/i386.c
>@@ -10281,6 +10281,256 @@ ix86_standard_x87sse_constant_load_p (const rtx_insn *insn, rtx dst)
>   return true;
> }
>
>+/* Return the opcode of the TYPE_SSEMOV instruction. To move from
>+   or to xmm16-xmm31/ymm16-ymm31 registers, we either require
>+   TARGET_AVX512VL or it is a register to register move which can
>+   be done with zmm register move. */
>+
>+static const char *
>+ix86_get_ssemov (rtx *operands, unsigned size, machine_mode mode)
>+{
>+  static char buf[128];
>+  bool misaligned_p = (misaligned_operand (operands[0], mode)
>+                       || misaligned_operand (operands[1], mode));
>+  bool evex_reg_p = (EXT_REX_SSE_REG_P (operands[0])
>+                     || EXT_REX_SSE_REG_P (operands[1]));
>+  machine_mode scalar_mode = GET_MODE_INNER (mode);
>+
>+  const char *opcode = NULL;
>+  enum
>+    {
>+      opcode_int,
>+      opcode_float,
>+      opcode_double
>+    } type = opcode_int;
>+  if (SCALAR_FLOAT_MODE_P (scalar_mode))
>+    {
>+      switch (scalar_mode)
>+        {
>+        case E_SFmode:
>+          if (size == 64 || !evex_reg_p || TARGET_AVX512VL)
>+            opcode = misaligned_p ? "%vmovups" : "%vmovaps";
>+          else
>+            type = opcode_float;
>+          break;
>+        case E_DFmode:
>+          if (size == 64 || !evex_reg_p || TARGET_AVX512VL)
>+            opcode = misaligned_p ? "%vmovupd" : "%vmovapd";
>+          else
>+            type = opcode_double;
>+          break;
>+        case E_TFmode:
>+          if (size == 64)
>+            opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
>+          else if (evex_reg_p)
>+            {
>+              if (TARGET_AVX512VL)
>+                opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
>+            }
>+          else
>+            opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa";
>+          break;
>+        default:
>+          gcc_unreachable ();
>+        }
>+    }
>+  else if (SCALAR_INT_MODE_P (scalar_mode))
>+    {
>+      switch (scalar_mode)
>+        {
>+        case E_QImode:
>+          if (size == 64)
>+            opcode = (misaligned_p
>+                      ? (TARGET_AVX512BW
>+                         ? "vmovdqu8"
>+                         : "vmovdqu64")
>+                      : "vmovdqa64");
>+          else if (evex_reg_p)
>+            {
>+              if (TARGET_AVX512VL)
>+                opcode = (misaligned_p
>+                          ? (TARGET_AVX512BW
>+                             ? "vmovdqu8"
>+                             : "vmovdqu64")
>+                          : "vmovdqa64");
>+            }
>+          else
>+            opcode = (misaligned_p
>+                      ? (TARGET_AVX512BW
>+                         ? "vmovdqu8"
>+                         : "%vmovdqu")
>+                      : "%vmovdqa");
>+          break;
>+        case E_HImode:
>+          if (size == 64)
>+            opcode = (misaligned_p
>+                      ? (TARGET_AVX512BW
>+                         ? "vmovdqu16"
>+                         : "vmovdqu64")
>+                      : "vmovdqa64");
>+          else if (evex_reg_p)
>+            {
>+              if (TARGET_AVX512VL)
>+                opcode = (misaligned_p
>+                          ? (TARGET_AVX512BW
>+                             ? "vmovdqu16"
>+                             : "vmovdqu64")
>+                          : "vmovdqa64");
>+            }
>+          else
>+            opcode = (misaligned_p
>+                      ? (TARGET_AVX512BW
>+                         ? "vmovdqu16"
>+                         : "%vmovdqu")
>+                      : "%vmovdqa");
>+          break;
>+        case E_SImode:
>+          if (size == 64)
>+            opcode = misaligned_p ? "vmovdqu32" : "vmovdqa32";
>+          else if (evex_reg_p)
>+            {
>+              if (TARGET_AVX512VL)
>+                opcode = misaligned_p ? "vmovdqu32" : "vmovdqa32";
>+            }
>+          else
>+            opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa";
>+          break;
>+        case E_DImode:
>+        case E_TImode:
>+        case E_OImode:
>+          if (size == 64)
>+            opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
>+          else if (evex_reg_p)
>+            {
>+              if (TARGET_AVX512VL)
>+                opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
>+            }
>+          else
>+            opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa";
>+          break;
>+        case E_XImode:
>+          opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
>+          break;
>+        default:
>+          gcc_unreachable ();
>+        }
>+    }
>+  else
>+    gcc_unreachable ();
>+
>+  if (!opcode)
>+    {
>+      /* NB: We get here only because we move xmm16-xmm31/ymm16-ymm31
>+         registers without AVX512VL by using zmm register move. */
>+      if (!evex_reg_p
>+          || TARGET_AVX512VL
>+          || memory_operand (operands[0], mode)
>+          || memory_operand (operands[1], mode))
>+        gcc_unreachable ();
>+      size = 64;
>+      switch (type)
>+        {
>+        case opcode_int:
>+          opcode = misaligned_p ? "vmovdqu32" : "vmovdqa32";
>+          break;
>+        case opcode_float:
>+          opcode = misaligned_p ? "%vmovups" : "%vmovaps";
>+          break;
>+        case opcode_double:
>+          opcode = misaligned_p ? "%vmovupd" : "%vmovapd";
>+          break;
>+        }
>+    }
>+
>+  switch (size)
>+    {
>+    case 64:
>+      snprintf (buf, sizeof (buf), "%s\t{%%g1, %%g0|%%g0, %%g1}",
>+                opcode);
>+      break;
>+    case 32:
>+      snprintf (buf, sizeof (buf), "%s\t{%%t1, %%t0|%%t0, %%t1}",
>+                opcode);
>+      break;
>+    case 16:
>+      snprintf (buf, sizeof (buf), "%s\t{%%x1, %%x0|%%x0, %%x1}",
>+                opcode);
>+      break;
>+    default:
>+      gcc_unreachable ();
>+    }
>+  return buf;
>+}
>+
>+/* Return the template of the TYPE_SSEMOV instruction to move
>+   operands[1] into operands[0]. */
>+
>+const char *
>+ix86_output_ssemov (rtx_insn *insn, rtx *operands)
>+{
>+  machine_mode mode = GET_MODE (operands[0]);
>+  if (get_attr_type (insn) != TYPE_SSEMOV
>+      || mode != GET_MODE (operands[1]))
>+    gcc_unreachable ();
>+
>+  enum attr_mode insn_mode = get_attr_mode (insn);
>+
>+  switch (insn_mode)
>+    {
>+    case MODE_XI:
>+    case MODE_V8DF:
>+    case MODE_V16SF:
>+      return ix86_get_ssemov (operands, 64, mode);
>+
>+    case MODE_OI:
>+    case MODE_V4DF:
>+    case MODE_V8SF:
>+      return ix86_get_ssemov (operands, 32, mode);
>+
>+    case MODE_TI:
>+    case MODE_V2DF:
>+    case MODE_V4SF:
>+      return ix86_get_ssemov (operands, 16, mode);
>+
>+    case MODE_DI:
>+      /* Handle broken assemblers that require movd instead of movq. */
>+      if (!HAVE_AS_IX86_INTERUNIT_MOVQ
>+          && (GENERAL_REG_P (operands[0])
>+              || GENERAL_REG_P (operands[1])))
>+        return "%vmovd\t{%1, %0|%0, %1}";
>+      else
>+        return "%vmovq\t{%1, %0|%0, %1}";
>+
>+    case MODE_V2SF:
>+      if (TARGET_AVX && REG_P (operands[0]))
>+        return "vmovlps\t{%1, %d0|%d0, %1}";
>+      else
>+        return "%vmovlps\t{%1, %0|%0, %1}";
>+
>+    case MODE_DF:
>+      if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>+        return "vmovsd\t{%d1, %0|%0, %d1}";
>+      else
>+        return "%vmovsd\t{%1, %0|%0, %1}";
>+
>+    case MODE_V1DF:
>+      gcc_assert (!TARGET_AVX);
>+      return "movlpd\t{%1, %0|%0, %1}";
>+
>+    case MODE_SI:
>+      return "%vmovd\t{%1, %0|%0, %1}";
>+
>+    case MODE_SF:
>+      if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>+        return "vmovss\t{%d1, %0|%0, %d1}";
>+      else
>+        return "%vmovss\t{%1, %0|%0, %1}";
>+
>+    default:
>+      gcc_unreachable ();
>+    }
>+}
>+
> /* Returns true if OP contains a symbol reference */
>
> bool
>diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
>index 9948f77fca5..40ed93dc804 100644
>--- a/gcc/config/i386/i386.md
>+++ b/gcc/config/i386/i386.md
>@@ -1878,11 +1878,7 @@
>       return standard_sse_constant_opcode (insn, operands);
>
>     case TYPE_SSEMOV:
>-      if (misaligned_operand (operands[0], XImode)
>-          || misaligned_operand (operands[1], XImode))
>-        return "vmovdqu32\t{%1, %0|%0, %1}";
>-      else
>-        return "vmovdqa32\t{%1, %0|%0, %1}";
>+      return ix86_output_ssemov (insn, operands);
>
>     default:
>       gcc_unreachable ();
>@@ -1905,25 +1901,7 @@
>       return standard_sse_constant_opcode (insn, operands);
>
>     case TYPE_SSEMOV:
>-      if (misaligned_operand (operands[0], OImode)
>-          || misaligned_operand (operands[1], OImode))
>-        {
>-          if (get_attr_mode (insn) == MODE_V8SF)
>-            return "vmovups\t{%1, %0|%0, %1}";
>-          else if (get_attr_mode (insn) == MODE_XI)
>-            return "vmovdqu32\t{%1, %0|%0, %1}";
>-          else
>-            return "vmovdqu\t{%1, %0|%0, %1}";
>-        }
>-      else
>-        {
>-          if (get_attr_mode (insn) == MODE_V8SF)
>-            return "vmovaps\t{%1, %0|%0, %1}";
>-          else if (get_attr_mode (insn) == MODE_XI)
>-            return "vmovdqa32\t{%1, %0|%0, %1}";
>-          else
>-            return "vmovdqa\t{%1, %0|%0, %1}";
>-        }
>+      return ix86_output_ssemov (insn, operands);
>
>     default:
>       gcc_unreachable ();
>@@ -1933,13 +1911,7 @@
>    (set_attr "type" "sselog1,sselog1,ssemov,ssemov")
>    (set_attr "prefix" "vex")
>    (set (attr "mode")
>-        (cond [(ior (match_operand 0 "ext_sse_reg_operand")
>-                    (match_operand 1 "ext_sse_reg_operand"))
>-                 (const_string "XI")
>-               (and (eq_attr "alternative" "1")
>-                    (match_test "TARGET_AVX512VL"))
>-                 (const_string "XI")
>-               (ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
>+        (cond [(ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
>                     (and (eq_attr "alternative" "3")
>                          (match_test "TARGET_SSE_TYPELESS_STORES")))
>                  (const_string "V8SF")
>@@ -1965,27 +1937,7 @@
>       return standard_sse_constant_opcode (insn, operands);
>
>     case TYPE_SSEMOV:
>-      /* TDmode values are passed as TImode on the stack. Moving them
>-         to stack may result in unaligned memory access. */
>-      if (misaligned_operand (operands[0], TImode)
>-          || misaligned_operand (operands[1], TImode))
>-        {
>-          if (get_attr_mode (insn) == MODE_V4SF)
>-            return "%vmovups\t{%1, %0|%0, %1}";
>-          else if (get_attr_mode (insn) == MODE_XI)
>-            return "vmovdqu32\t{%1, %0|%0, %1}";
>-          else
>-            return "%vmovdqu\t{%1, %0|%0, %1}";
>-        }
>-      else
>-        {
>-          if (get_attr_mode (insn) == MODE_V4SF)
>-            return "%vmovaps\t{%1, %0|%0, %1}";
>-          else if (get_attr_mode (insn) == MODE_XI)
>-            return "vmovdqa32\t{%1, %0|%0, %1}";
>-          else
>-            return "%vmovdqa\t{%1, %0|%0, %1}";
>-        }
>+      return ix86_output_ssemov (insn, operands);
>
>     default:
>       gcc_unreachable ();
>@@ -2012,12 +1964,6 @@
>    (set (attr "mode")
>         (cond [(eq_attr "alternative" "0,1")
>                  (const_string "DI")
>-               (ior (match_operand 0 "ext_sse_reg_operand")
>-                    (match_operand 1 "ext_sse_reg_operand"))
>-                 (const_string "XI")
>-               (and (eq_attr "alternative" "3")
>-                    (match_test "TARGET_AVX512VL"))
>-                 (const_string "XI")
>                (ior (not (match_test "TARGET_SSE2"))
>                     (ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
>                          (and (eq_attr "alternative" "5")
>@@ -2091,31 +2037,7 @@
>       return standard_sse_constant_opcode (insn, operands);
>
>     case TYPE_SSEMOV:
>-      switch (get_attr_mode (insn))
>-        {
>-        case MODE_DI:
>-          /* Handle broken assemblers that require movd instead of movq. */
>-          if (!HAVE_AS_IX86_INTERUNIT_MOVQ
>-              && (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])))
>-            return "%vmovd\t{%1, %0|%0, %1}";
>-          return "%vmovq\t{%1, %0|%0, %1}";
>-
>-        case MODE_TI:
>-          /* Handle AVX512 registers set. */
>-          if (EXT_REX_SSE_REG_P (operands[0])
>-              || EXT_REX_SSE_REG_P (operands[1]))
>-            return "vmovdqa64\t{%1, %0|%0, %1}";
>-          return "%vmovdqa\t{%1, %0|%0, %1}";
>-
>-        case MODE_V2SF:
>-          gcc_assert (!TARGET_AVX);
>-          return "movlps\t{%1, %0|%0, %1}";
>-        case MODE_V4SF:
>-          return "%vmovaps\t{%1, %0|%0, %1}";
>-
>-        default:
>-          gcc_unreachable ();
>-        }
>+      return ix86_output_ssemov (insn, operands);
>
>     case TYPE_SSECVT:
>       if (SSE_REG_P (operands[0]))
>@@ -2201,10 +2123,7 @@
>      (cond [(eq_attr "alternative" "2")
>               (const_string "SI")
>             (eq_attr "alternative" "12,13")
>-              (cond [(ior (match_operand 0 "ext_sse_reg_operand")
>-                          (match_operand 1 "ext_sse_reg_operand"))
>-                       (const_string "TI")
>-                     (ior (not (match_test "TARGET_SSE2"))
>+              (cond [(ior (not (match_test "TARGET_SSE2"))
>                           (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
>                        (const_string "V4SF")
>                      (match_test "TARGET_AVX")
>@@ -2327,25 +2246,7 @@
>       gcc_unreachable ();
>
>     case TYPE_SSEMOV:
>-      switch (get_attr_mode (insn))
>-        {
>-        case MODE_SI:
>-          return "%vmovd\t{%1, %0|%0, %1}";
>-        case MODE_TI:
>-          return "%vmovdqa\t{%1, %0|%0, %1}";
>-        case MODE_XI:
>-          return "vmovdqa32\t{%g1, %g0|%g0, %g1}";
>-
>-        case MODE_V4SF:
>-          return "%vmovaps\t{%1, %0|%0, %1}";
>-
>-        case MODE_SF:
>-          gcc_assert (!TARGET_AVX);
>-          return "movss\t{%1, %0|%0, %1}";
>-
>-        default:
>-          gcc_unreachable ();
>-        }
>+      return ix86_output_ssemov (insn, operands);
>
>     case TYPE_MMX:
>       return "pxor\t%0, %0";
>@@ -2411,10 +2312,7 @@
>      (cond [(eq_attr "alternative" "2,3")
>               (const_string "DI")
>             (eq_attr "alternative" "8,9")
>-              (cond [(ior (match_operand 0 "ext_sse_reg_operand")
>-                          (match_operand 1 "ext_sse_reg_operand"))
>-                       (const_string "XI")
>-                     (ior (not (match_test "TARGET_SSE2"))
>+              (cond [(ior (not (match_test "TARGET_SSE2"))
>                           (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
>                        (const_string "V4SF")
>                      (match_test "TARGET_AVX")
>@@ -3234,31 +3132,7 @@
>       return standard_sse_constant_opcode (insn, operands);
>
>     case TYPE_SSEMOV:
>-      /* Handle misaligned load/store since we
>-         don't have movmisaligntf pattern. */
>-      if (misaligned_operand (operands[0], TFmode)
>-          || misaligned_operand (operands[1], TFmode))
>-        {
>-          if (get_attr_mode (insn) == MODE_V4SF)
>-            return "%vmovups\t{%1, %0|%0, %1}";
>-          else if (TARGET_AVX512VL
>-                   && (EXT_REX_SSE_REG_P (operands[0])
>-                       || EXT_REX_SSE_REG_P (operands[1])))
>-            return "vmovdqu64\t{%1, %0|%0, %1}";
>-          else
>-            return "%vmovdqu\t{%1, %0|%0, %1}";
>-        }
>-      else
>-        {
>-          if (get_attr_mode (insn) == MODE_V4SF)
>-            return "%vmovaps\t{%1, %0|%0, %1}";
>-          else if (TARGET_AVX512VL
>-                   && (EXT_REX_SSE_REG_P (operands[0])
>-                       || EXT_REX_SSE_REG_P (operands[1])))
>-            return "vmovdqa64\t{%1, %0|%0, %1}";
>-          else
>-            return "%vmovdqa\t{%1, %0|%0, %1}";
>-        }
>+      return ix86_output_ssemov (insn, operands);
>
>     case TYPE_MULTI:
>       return "#";
>@@ -3411,37 +3285,7 @@
>       return standard_sse_constant_opcode (insn, operands);
>
>     case TYPE_SSEMOV:
>-      switch (get_attr_mode (insn))
>-        {
>-        case MODE_DF:
>-          if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>-            return "vmovsd\t{%d1, %0|%0, %d1}";
>-          return "%vmovsd\t{%1, %0|%0, %1}";
>-
>-        case MODE_V4SF:
>-          return "%vmovaps\t{%1, %0|%0, %1}";
>-        case MODE_V8DF:
>-          return "vmovapd\t{%g1, %g0|%g0, %g1}";
>-        case MODE_V2DF:
>-          return "%vmovapd\t{%1, %0|%0, %1}";
>-
>-        case MODE_V2SF:
>-          gcc_assert (!TARGET_AVX);
>-          return "movlps\t{%1, %0|%0, %1}";
>-        case MODE_V1DF:
>-          gcc_assert (!TARGET_AVX);
>-          return "movlpd\t{%1, %0|%0, %1}";
>-
>-        case MODE_DI:
>-          /* Handle broken assemblers that require movd instead of movq. */
>-          if (!HAVE_AS_IX86_INTERUNIT_MOVQ
>-              && (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])))
>-            return "%vmovd\t{%1, %0|%0, %1}";
>-          return "%vmovq\t{%1, %0|%0, %1}";
>-
>-        default:
>-          gcc_unreachable ();
>-        }
>+      return ix86_output_ssemov (insn, operands);
>
>     default:
>       gcc_unreachable ();
>@@ -3497,9 +3341,6 @@
>                (eq_attr "alternative" "12,16")
>                  (cond [(not (match_test "TARGET_SSE2"))
>                           (const_string "V4SF")
>-                        (and (match_test "TARGET_AVX512F")
>-                             (not (match_test "TARGET_PREFER_AVX256")))
>-                          (const_string "XI")
>                         (match_test "TARGET_AVX")
>                           (const_string "V2DF")
>                         (match_test "optimize_function_for_size_p (cfun)")
>@@ -3515,12 +3356,7 @@
>
>                /* movaps is one byte shorter for non-AVX targets. */
>                (eq_attr "alternative" "13,17")
>-                 (cond [(and (ior (not (match_test "TARGET_PREFER_AVX256"))
>-                                  (not (match_test "TARGET_AVX512VL")))
>-                             (ior (match_operand 0 "ext_sse_reg_operand")
>-                                  (match_operand 1 "ext_sse_reg_operand")))
>-                          (const_string "V8DF")
>-                        (ior (not (match_test "TARGET_SSE2"))
>+                 (cond [(ior (not (match_test "TARGET_SSE2"))
>                              (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
>                           (const_string "V4SF")
>                         (match_test "TARGET_SSE_PARTIAL_REG_DEPENDENCY")
>@@ -3612,24 +3448,7 @@
>       return standard_sse_constant_opcode (insn, operands);
>
>     case TYPE_SSEMOV:
>-      switch (get_attr_mode (insn))
>-        {
>-        case MODE_SF:
>-          if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>-            return "vmovss\t{%d1, %0|%0, %d1}";
>-          return "%vmovss\t{%1, %0|%0, %1}";
>-
>-        case MODE_V16SF:
>-          return "vmovaps\t{%g1, %g0|%g0, %g1}";
>-        case MODE_V4SF:
>-          return "%vmovaps\t{%1, %0|%0, %1}";
>-
>-        case MODE_SI:
>-          return "%vmovd\t{%1, %0|%0, %1}";
>-
>-        default:
>-          gcc_unreachable ();
>-        }
>+      return ix86_output_ssemov (insn, operands);
>
>     case TYPE_MMXMOV:
>       switch (get_attr_mode (insn))
>@@ -3702,12 +3521,7 @@
>                   better to maintain the whole registers in single format
>                   to avoid problems on using packed logical operations. */
>                (eq_attr "alternative" "6")
>-                 (cond [(and (ior (not (match_test "TARGET_PREFER_AVX256"))
>-                                  (not (match_test "TARGET_AVX512VL")))
>-                             (ior (match_operand 0 "ext_sse_reg_operand")
>-                                  (match_operand 1 "ext_sse_reg_operand")))
>-                          (const_string "V16SF")
>-                        (ior (match_test "TARGET_SSE_PARTIAL_REG_DEPENDENCY")
>+                 (cond [(ior (match_test "TARGET_SSE_PARTIAL_REG_DEPENDENCY")
>                              (match_test "TARGET_SSE_SPLIT_REGS"))
>                           (const_string "V4SF")
>                        ]
>diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
>index c1e0f2c411e..9c3808338d3 100644
>--- a/gcc/config/i386/mmx.md
>+++ b/gcc/config/i386/mmx.md
>@@ -115,29 +115,7 @@
>       return standard_sse_constant_opcode (insn, operands);
>
>     case TYPE_SSEMOV:
>-      switch (get_attr_mode (insn))
>-        {
>-        case MODE_DI:
>-          /* Handle broken assemblers that require movd instead of movq. */
>-          if (!HAVE_AS_IX86_INTERUNIT_MOVQ
>-              && (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])))
>-            return "%vmovd\t{%1, %0|%0, %1}";
>-          return "%vmovq\t{%1, %0|%0, %1}";
>-        case MODE_TI:
>-          return "%vmovdqa\t{%1, %0|%0, %1}";
>-        case MODE_XI:
>-          return "vmovdqa64\t{%g1, %g0|%g0, %g1}";
>-
>-        case MODE_V2SF:
>-          if (TARGET_AVX && REG_P (operands[0]))
>-            return "vmovlps\t{%1, %0, %0|%0, %0, %1}";
>-          return "%vmovlps\t{%1, %0|%0, %1}";
>-        case MODE_V4SF:
>-          return "%vmovaps\t{%1, %0|%0, %1}";
>-
>-        default:
>-          gcc_unreachable ();
>-        }
>+      return ix86_output_ssemov (insn, operands);
>
>     default:
>       gcc_unreachable ();
>@@ -186,10 +164,7 @@
>      (cond [(eq_attr "alternative" "2")
>               (const_string "SI")
>             (eq_attr "alternative" "11,12")
>-              (cond [(ior (match_operand 0 "ext_sse_reg_operand")
>-                          (match_operand 1 "ext_sse_reg_operand"))
>-                       (const_string "XI")
>-                     (match_test "<MODE>mode == V2SFmode")
>+              (cond [(match_test "<MODE>mode == V2SFmode")
>                        (const_string "V4SF")
>                      (ior (not (match_test "TARGET_SSE2"))
>                           (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
>diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
>index 865947debcc..99226e86436 100644
>--- a/gcc/config/i386/predicates.md
>+++ b/gcc/config/i386/predicates.md
>@@ -54,11 +54,6 @@
>   (and (match_code "reg")
>        (match_test "SSE_REGNO_P (REGNO (op))")))
>
>-;; True if the operand is an AVX-512 new register.
>-(define_predicate "ext_sse_reg_operand"
>-  (and (match_code "reg")
>-       (match_test "EXT_REX_SSE_REGNO_P (REGNO (op))")))
>-
> ;; Return true if op is a QImode register.
> (define_predicate "any_QIreg_operand"
>   (and (match_code "reg")
>diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
>index 5dc0930ac1f..2014f0a7832 100644
>--- a/gcc/config/i386/sse.md
>+++ b/gcc/config/i386/sse.md
>@@ -982,98 +982,7 @@
>       return standard_sse_constant_opcode (insn, operands);
>
>     case TYPE_SSEMOV:
>-      /* There is no evex-encoded vmov* for sizes smaller than 64-bytes
>-         in avx512f, so we need to use workarounds, to access sse registers
>-         16-31, which are evex-only. In avx512vl we don't need workarounds. */
>-      if (TARGET_AVX512F && <MODE_SIZE> < 64 && !TARGET_AVX512VL
>-          && (EXT_REX_SSE_REG_P (operands[0])
>-              || EXT_REX_SSE_REG_P (operands[1])))
>-        {
>-          if (memory_operand (operands[0], <MODE>mode))
>-            {
>-              if (<MODE_SIZE> == 32)
>-                return "vextract<shuffletype>64x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
>-              else if (<MODE_SIZE> == 16)
>-                return "vextract<shuffletype>32x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
>-              else
>-                gcc_unreachable ();
>-            }
>-          else if (memory_operand (operands[1], <MODE>mode))
>-            {
>-              if (<MODE_SIZE> == 32)
>-                return "vbroadcast<shuffletype>64x4\t{%1, %g0|%g0, %1}";
>-              else if (<MODE_SIZE> == 16)
>-                return "vbroadcast<shuffletype>32x4\t{%1, %g0|%g0, %1}";
>-              else
>-                gcc_unreachable ();
>-            }
>-          else
>-            /* Reg -> reg move is always aligned. Just use wider move. */
>-            switch (get_attr_mode (insn))
>-              {
>-              case MODE_V8SF:
>-              case MODE_V4SF:
>-                return "vmovaps\t{%g1, %g0|%g0, %g1}";
>-              case MODE_V4DF:
>-              case MODE_V2DF:
>-                return "vmovapd\t{%g1, %g0|%g0, %g1}";
>-              case MODE_OI:
>-              case MODE_TI:
>-                return "vmovdqa64\t{%g1, %g0|%g0, %g1}";
>-              default:
>-                gcc_unreachable ();
>-              }
>-        }
>-
>-      switch (get_attr_mode (insn))
>-        {
>-        case MODE_V16SF:
>-        case MODE_V8SF:
>-        case MODE_V4SF:
>-          if (misaligned_operand (operands[0], <MODE>mode)
>-              || misaligned_operand (operands[1], <MODE>mode))
>-            return "%vmovups\t{%1, %0|%0, %1}";
>-          else
>-            return "%vmovaps\t{%1, %0|%0, %1}";
>-
>-        case MODE_V8DF:
>-        case MODE_V4DF:
>-        case MODE_V2DF:
>-          if (misaligned_operand (operands[0], <MODE>mode)
>-              || misaligned_operand (operands[1], <MODE>mode))
>-            return "%vmovupd\t{%1, %0|%0, %1}";
>-          else
>-            return "%vmovapd\t{%1, %0|%0, %1}";
>-
>-        case MODE_OI:
>-        case MODE_TI:
>-          if (misaligned_operand (operands[0], <MODE>mode)
>-              || misaligned_operand (operands[1], <MODE>mode))
>-            return TARGET_AVX512VL
>-                   && (<MODE>mode == V4SImode
>-                       || <MODE>mode == V2DImode
>-                       || <MODE>mode == V8SImode
>-                       || <MODE>mode == V4DImode
>-                       || TARGET_AVX512BW)
>-                   ? "vmovdqu<ssescalarsize>\t{%1, %0|%0, %1}"
>-                   : "%vmovdqu\t{%1, %0|%0, %1}";
>-          else
>-            return TARGET_AVX512VL ? "vmovdqa64\t{%1, %0|%0, %1}"
>-                                   : "%vmovdqa\t{%1, %0|%0, %1}";
>-        case MODE_XI:
>-          if (misaligned_operand (operands[0], <MODE>mode)
>-              || misaligned_operand (operands[1], <MODE>mode))
>-            return (<MODE>mode == V16SImode
>-                    || <MODE>mode == V8DImode
>-                    || TARGET_AVX512BW)
>-                   ? "vmovdqu<ssescalarsize>\t{%1, %0|%0, %1}"
>-                   : "vmovdqu64\t{%1, %0|%0, %1}";
>-          else
>-            return "vmovdqa64\t{%1, %0|%0, %1}";
>-
>-        default:
>-          gcc_unreachable ();
>-        }
>+      return ix86_output_ssemov (insn, operands);
>
>     default:
>       gcc_unreachable ();
>@@ -1082,10 +991,7 @@
>   [(set_attr "type" "sselog1,sselog1,ssemov,ssemov")
>    (set_attr "prefix" "maybe_vex")
>    (set (attr "mode")
>-        (cond [(and (eq_attr "alternative" "1")
>-                    (match_test "TARGET_AVX512VL"))
>-                 (const_string "<sseinsnmode>")
>-               (and (match_test "<MODE_SIZE> == 16")
>+        (cond [(and (match_test "<MODE_SIZE> == 16")
>                     (ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
>                          (and (eq_attr "alternative" "3")
>                               (match_test "TARGET_SSE_TYPELESS_STORES"))))
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-2a.c b/gcc/testsuite/gcc.target/i386/pr89229-2a.c
>new file mode 100644
>index 00000000000..0cf78039481
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-2a.c
>@@ -0,0 +1,15 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512" } */
>+
>+typedef __int128 __m128t __attribute__ ((__vector_size__ (16),
>+                                         __may_alias__));
>+
>+__m128t
>+foo1 (void)
>+{
>+  register __int128 xmm16 __asm ("xmm16") = (__int128) -1;
>+  asm volatile ("" : "+v" (xmm16));
>+  return (__m128t) xmm16;
>+}
>+
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-2b.c b/gcc/testsuite/gcc.target/i386/pr89229-2b.c
>new file mode 100644
>index 00000000000..8d5d6c41d30
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-2b.c
>@@ -0,0 +1,13 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
>+
>+typedef __int128 __m128t __attribute__ ((__vector_size__ (16),
>+                                         __may_alias__));
>+
>+__m128t
>+foo1 (void)
>+{
>+  register __int128 xmm16 __asm ("xmm16") = (__int128) -1; /* { dg-error "register specified for 'xmm16'" } */
>+  asm volatile ("" : "+v" (xmm16));
>+  return (__m128t) xmm16;
>+}
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-2c.c b/gcc/testsuite/gcc.target/i386/pr89229-2c.c
>new file mode 100644
>index 00000000000..218da46dcd0
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-2c.c
>@@ -0,0 +1,6 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
>+
>+#include "pr89229-2a.c"
>+
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-3a.c b/gcc/testsuite/gcc.target/i386/pr89229-3a.c
>new file mode 100644
>index 00000000000..fd56f447016
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-3a.c
>@@ -0,0 +1,17 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512" } */
>+
>+extern int i;
>+
>+int
>+foo1 (void)
>+{
>+  register int xmm16 __asm ("xmm16") = i;
>+  asm volatile ("" : "+v" (xmm16));
>+  register int xmm17 __asm ("xmm17") = xmm16;
>+  asm volatile ("" : "+v" (xmm17));
>+  return xmm17;
>+}
>+
>+/* { dg-final { scan-assembler-times "vmovdqa32\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-3b.c b/gcc/testsuite/gcc.target/i386/pr89229-3b.c
>new file mode 100644
>index 00000000000..9265fc0354b
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-3b.c
>@@ -0,0 +1,6 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
>+
>+#include "pr89229-3a.c"
>+
>+/* { dg-final { scan-assembler-times "vmovdqa32\[^\n\r]*zmm1\[67]\[^\n\r]*zmm1\[67]" 1 } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-3c.c b/gcc/testsuite/gcc.target/i386/pr89229-3c.c
>new file mode 100644
>index 00000000000..d3fdf1ee273
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-3c.c
>@@ -0,0 +1,7 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
>+
>+#include "pr89229-3a.c"
>+
>+/* { dg-final { scan-assembler-times "vmovdqa32\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-4a.c b/gcc/testsuite/gcc.target/i386/pr89229-4a.c
>new file mode 100644
>index 00000000000..cb9b071e873
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-4a.c
>@@ -0,0 +1,17 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
>+
>+extern long long i;
>+
>+long long
>+foo1 (void)
>+{
>+  register long long xmm16 __asm ("xmm16") = i;
>+  asm volatile ("" : "+v" (xmm16));
>+  register long long xmm17 __asm ("xmm17") = xmm16;
>+  asm volatile ("" : "+v" (xmm17));
>+  return xmm17;
>+}
>+
>+/* { dg-final { scan-assembler-times "vmovdqa64\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-4b.c b/gcc/testsuite/gcc.target/i386/pr89229-4b.c
>new file mode 100644
>index 00000000000..023e81253a0
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-4b.c
>@@ -0,0 +1,6 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
>+
>+#include "pr89229-4a.c"
>+
>+/* { dg-final { scan-assembler-times "vmovdqa32\[^\n\r]*zmm1\[67]\[^\n\r]*zmm1\[67]" 1 } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-4c.c b/gcc/testsuite/gcc.target/i386/pr89229-4c.c
>new file mode 100644
>index 00000000000..e02eb37c16d
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-4c.c
>@@ -0,0 +1,7 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
>+
>+#include "pr89229-4a.c"
>+
>+/* { dg-final { scan-assembler-times "vmovdqa64\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-5a.c b/gcc/testsuite/gcc.target/i386/pr89229-5a.c
>new file mode 100644
>index 00000000000..856115b2f5a
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-5a.c
>@@ -0,0 +1,16 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512" } */
>+
>+extern float d;
>+
>+void
>+foo1 (float x)
>+{
>+  register float xmm16 __asm ("xmm16") = x;
>+  asm volatile ("" : "+v" (xmm16));
>+  register float xmm17 __asm ("xmm17") = xmm16;
>+  asm volatile ("" : "+v" (xmm17));
>+  d = xmm17;
>+}
>+
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-5b.c b/gcc/testsuite/gcc.target/i386/pr89229-5b.c
>new file mode 100644
>index 00000000000..cb0f3b55ccc
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-5b.c
>@@ -0,0 +1,6 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
>+
>+#include "pr89229-5a.c"
>+
>+/* { dg-final { scan-assembler-times "vmovaps\[^\n\r]*zmm1\[67]\[^\n\r]*zmm1\[67]" 1 } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-5c.c b/gcc/testsuite/gcc.target/i386/pr89229-5c.c
>new file mode 100644
>index 00000000000..529a520133c
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-5c.c
>@@ -0,0 +1,6 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
>+
>+#include "pr89229-5a.c"
>+
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-6a.c b/gcc/testsuite/gcc.target/i386/pr89229-6a.c
>new file mode 100644
>index 00000000000..f88d7c8d74c
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-6a.c
>@@ -0,0 +1,16 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512" } */
>+
>+extern double d;
>+
>+void
>+foo1 (double x)
>+{
>+  register double xmm16 __asm ("xmm16") = x;
>+  asm volatile ("" : "+v" (xmm16));
>+  register double xmm17 __asm ("xmm17") = xmm16;
>+  asm volatile ("" : "+v" (xmm17));
>+  d = xmm17;
>+}
>+
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-6b.c b/gcc/testsuite/gcc.target/i386/pr89229-6b.c
>new file mode 100644
>index 00000000000..316d85d921e
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-6b.c
>@@ -0,0 +1,6 @@
>+/* { dg-do compile { target { !
ia32 } } } */ >+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */ >+ >+#include "pr89229-6a.c" >+ >+/* { dg-final { scan-assembler-times >"vmovapd\[^\n\r]*zmm1\[67]\[^\n\r]*zmm1\[67]" 1 } } */ >diff --git a/gcc/testsuite/gcc.target/i386/pr89229-6c.c >b/gcc/testsuite/gcc.target/i386/pr89229-6c.c >new file mode 100644 >index 00000000000..7a4d254670c >--- /dev/null >+++ b/gcc/testsuite/gcc.target/i386/pr89229-6c.c >@@ -0,0 +1,6 @@ >+/* { dg-do compile { target { ! ia32 } } } */ >+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */ >+ >+#include "pr89229-6a.c" >+ >+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */ >diff --git a/gcc/testsuite/gcc.target/i386/pr89229-7a.c >b/gcc/testsuite/gcc.target/i386/pr89229-7a.c >new file mode 100644 >index 00000000000..fcb85c366b6 >--- /dev/null >+++ b/gcc/testsuite/gcc.target/i386/pr89229-7a.c >@@ -0,0 +1,16 @@ >+/* { dg-do compile { target { ! ia32 } } } */ >+/* { dg-options "-O2 -march=skylake-avx512" } */ >+ >+extern __float128 d; >+ >+void >+foo1 (__float128 x) >+{ >+ register __float128 xmm16 __asm ("xmm16") = x; >+ asm volatile ("" : "+v" (xmm16)); >+ register __float128 xmm17 __asm ("xmm17") = xmm16; >+ asm volatile ("" : "+v" (xmm17)); >+ d = xmm17; >+} >+ >+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */ >diff --git a/gcc/testsuite/gcc.target/i386/pr89229-7b.c >b/gcc/testsuite/gcc.target/i386/pr89229-7b.c >new file mode 100644 >index 00000000000..37eb83c783b >--- /dev/null >+++ b/gcc/testsuite/gcc.target/i386/pr89229-7b.c >@@ -0,0 +1,12 @@ >+/* { dg-do compile { target { ! 
ia32 } } } */ >+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */ >+ >+extern __float128 d; >+ >+void >+foo1 (__float128 x) >+{ >+ register __float128 xmm16 __asm ("xmm16") = x; /* { dg-error "register >specified for 'xmm16'" } */ >+ asm volatile ("" : "+v" (xmm16)); >+ d = xmm16; >+} >diff --git a/gcc/testsuite/gcc.target/i386/pr89229-7c.c >b/gcc/testsuite/gcc.target/i386/pr89229-7c.c >new file mode 100644 >index 00000000000..e37ff2bf5bd >--- /dev/null >+++ b/gcc/testsuite/gcc.target/i386/pr89229-7c.c >@@ -0,0 +1,6 @@ >+/* { dg-do compile { target { ! ia32 } } } */ >+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */ >+ >+#include "pr89229-7a.c" >+ >+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */ >-- >2.20.1 >