https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89229

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #45707|0                           |1
        is obsolete|                            |

--- Comment #24 from H.J. Lu <hjl.tools at gmail dot com> ---
Comment on attachment 45707
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45707
A new patch

>From fd7220a7551ee774614ca89574241813aae153b7 Mon Sep 17 00:00:00 2001
>From: "H.J. Lu" <hjl.to...@gmail.com>
>Date: Tue, 12 Feb 2019 13:25:41 -0800
>Subject: [PATCH] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move
>
>The i386 backend has
>
>INT_MODE (OI, 32);
>INT_MODE (XI, 64);
>
>So XImode represents 64 integer bytes, i.e. a 64 * 8 = 512 bit operation;
>in the case of const_1, all 512 bits are set.
>
>We can load zeros with a narrower instruction (e.g. 256 bits via the
>inherent zeroing of the high part by a 128 bit xor), so TImode is used
>in this case.
>
>Some targets prefer V4SF mode, so they emit a float xorps for zeroing.
>
>sse.md has
>
>(define_insn "mov<mode>_internal"
>  [(set (match_operand:VMOVE 0 "nonimmediate_operand"
>         "=v,v ,v ,m")
>        (match_operand:VMOVE 1 "nonimmediate_or_sse_const_operand"
>         " C,BC,vm,v"))]
>....
>      /* There is no evex-encoded vmov* for sizes smaller than 64-bytes
>         in avx512f, so we need to use workarounds, to access sse registers
>         16-31, which are evex-only. In avx512vl we don't need workarounds.  */
>      if (TARGET_AVX512F && <MODE_SIZE> < 64 && !TARGET_AVX512VL
>          && (EXT_REX_SSE_REG_P (operands[0])
>              || EXT_REX_SSE_REG_P (operands[1])))
>        {
>          if (memory_operand (operands[0], <MODE>mode))
>            {
>              if (<MODE_SIZE> == 32)
>                return "vextract<shuffletype>64x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
>              else if (<MODE_SIZE> == 16)
>                return "vextract<shuffletype>32x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
>              else
>                gcc_unreachable ();
>            }
>...
>
>However, since ix86_hard_regno_mode_ok has
>
>     /* TODO check for QI/HI scalars.  */
>      /* AVX512VL allows sse regs16+ for 128/256 bit modes.  */
>      if (TARGET_AVX512VL
>          && (mode == OImode
>              || mode == TImode
>              || VALID_AVX256_REG_MODE (mode)
>              || VALID_AVX512VL_128_REG_MODE (mode)))
>        return true;
>
>      /* xmm16-xmm31 are only available for AVX-512.  */
>      if (EXT_REX_SSE_REGNO_P (regno))
>        return false;
>
>      if (TARGET_AVX512F && <MODE_SIZE> < 64 && !TARGET_AVX512VL
>          && (EXT_REX_SSE_REG_P (operands[0])
>              || EXT_REX_SSE_REG_P (operands[1])))
>
>is dead code: without AVX512VL, ix86_hard_regno_mode_ok never allows
>xmm16-xmm31 to hold 16-byte or 32-byte vector modes, so the condition
>above can never be true.
>
>All TYPE_SSEMOV vector moves are consolidated into ix86_output_ssemov:
>
>1. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE/AVX vector
>   moves are generated.
>2. If xmm16-xmm31/ymm16-ymm31 registers are used:
>   a. With AVX512VL, AVX512VL vector moves are generated.
>   b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register-to-register
>      moves are done as zmm register moves.
>
>ext_sse_reg_operand is removed since it is no longer needed.
>
>gcc/
>
>       PR target/89229
>       * config/i386/i386-protos.h (ix86_output_ssemov): New prototype.
>       * config/i386/i386.c (ix86_get_ssemov): New function.
>       (ix86_output_ssemov): Likewise.
>       * config/i386/i386.md (*movxi_internal_avx512f): Call
>       ix86_output_ssemov for TYPE_SSEMOV.
>       (*movoi_internal_avx): Call ix86_output_ssemov for TYPE_SSEMOV.
>       Remove ext_sse_reg_operand and TARGET_AVX512VL check.
>       (*movti_internal): Likewise.
>       (*movdi_internal): Call ix86_output_ssemov for TYPE_SSEMOV.
>       Remove ext_sse_reg_operand check.
>       (*movsi_internal): Likewise.
>       (*movtf_internal): Call ix86_output_ssemov for TYPE_SSEMOV.
>       (*movdf_internal): Call ix86_output_ssemov for TYPE_SSEMOV.
>       Remove TARGET_AVX512F, TARGET_PREFER_AVX256, TARGET_AVX512VL
>       and ext_sse_reg_operand check.
>       (*movsf_internal_avx): Call ix86_output_ssemov for TYPE_SSEMOV.
>       Remove TARGET_PREFER_AVX256, TARGET_AVX512VL and
>       ext_sse_reg_operand check.
>       * config/i386/mmx.md (MMXMODE:*mov<mode>_internal): Call
>       ix86_output_ssemov for TYPE_SSEMOV.  Remove ext_sse_reg_operand
>       check.
>       * config/i386/sse.md (VMOVE:mov<mode>_internal): Call
>       ix86_output_ssemov for TYPE_SSEMOV.  Remove TARGET_AVX512VL
>       check.
>       * config/i386/predicates.md (ext_sse_reg_operand): Removed.
>
>gcc/testsuite/
>
>       PR target/89229
>       * gcc.target/i386/pr89229-2a.c: New test.
>       * gcc.target/i386/pr89229-2b.c: Likewise.
>       * gcc.target/i386/pr89229-2c.c: Likewise.
>       * gcc.target/i386/pr89229-3a.c: Likewise.
>       * gcc.target/i386/pr89229-3b.c: Likewise.
>       * gcc.target/i386/pr89229-3c.c: Likewise.
>       * gcc.target/i386/pr89229-4a.c: Likewise.
>       * gcc.target/i386/pr89229-4b.c: Likewise.
>       * gcc.target/i386/pr89229-4c.c: Likewise.
>       * gcc.target/i386/pr89229-5a.c: Likewise.
>       * gcc.target/i386/pr89229-5b.c: Likewise.
>       * gcc.target/i386/pr89229-5c.c: Likewise.
>       * gcc.target/i386/pr89229-6a.c: Likewise.
>       * gcc.target/i386/pr89229-6b.c: Likewise.
>       * gcc.target/i386/pr89229-6c.c: Likewise.
>       * gcc.target/i386/pr89229-7a.c: Likewise.
>       * gcc.target/i386/pr89229-7b.c: Likewise.
>       * gcc.target/i386/pr89229-7c.c: Likewise.
>---
> gcc/config/i386/i386-protos.h              |   2 +
> gcc/config/i386/i386.c                     | 250 +++++++++++++++++++++
> gcc/config/i386/i386.md                    | 212 ++---------------
> gcc/config/i386/mmx.md                     |  29 +--
> gcc/config/i386/predicates.md              |   5 -
> gcc/config/i386/sse.md                     |  98 +-------
> gcc/testsuite/gcc.target/i386/pr89229-2a.c |  15 ++
> gcc/testsuite/gcc.target/i386/pr89229-2b.c |  13 ++
> gcc/testsuite/gcc.target/i386/pr89229-2c.c |   6 +
> gcc/testsuite/gcc.target/i386/pr89229-3a.c |  17 ++
> gcc/testsuite/gcc.target/i386/pr89229-3b.c |   6 +
> gcc/testsuite/gcc.target/i386/pr89229-3c.c |   7 +
> gcc/testsuite/gcc.target/i386/pr89229-4a.c |  17 ++
> gcc/testsuite/gcc.target/i386/pr89229-4b.c |   6 +
> gcc/testsuite/gcc.target/i386/pr89229-4c.c |   7 +
> gcc/testsuite/gcc.target/i386/pr89229-5a.c |  16 ++
> gcc/testsuite/gcc.target/i386/pr89229-5b.c |   6 +
> gcc/testsuite/gcc.target/i386/pr89229-5c.c |   6 +
> gcc/testsuite/gcc.target/i386/pr89229-6a.c |  16 ++
> gcc/testsuite/gcc.target/i386/pr89229-6b.c |   6 +
> gcc/testsuite/gcc.target/i386/pr89229-6c.c |   6 +
> gcc/testsuite/gcc.target/i386/pr89229-7a.c |  16 ++
> gcc/testsuite/gcc.target/i386/pr89229-7b.c |  12 +
> gcc/testsuite/gcc.target/i386/pr89229-7c.c |   6 +
> 24 files changed, 453 insertions(+), 327 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2a.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2b.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-2c.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3a.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3b.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-3c.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4a.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4b.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-4c.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5a.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5b.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-5c.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6a.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6b.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-6c.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7a.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7b.c
> create mode 100644 gcc/testsuite/gcc.target/i386/pr89229-7c.c
>
>diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
>index 2d600173917..27f5cc13abf 100644
>--- a/gcc/config/i386/i386-protos.h
>+++ b/gcc/config/i386/i386-protos.h
>@@ -38,6 +38,8 @@ extern void ix86_expand_split_stack_prologue (void);
> extern void ix86_output_addr_vec_elt (FILE *, int);
> extern void ix86_output_addr_diff_elt (FILE *, int, int);
> 
>+extern const char *ix86_output_ssemov (rtx_insn *, rtx *);
>+
> extern enum calling_abi ix86_cfun_abi (void);
> extern enum calling_abi ix86_function_type_abi (const_tree);
> 
>diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>index fd05873ba39..97d1ea4229e 100644
>--- a/gcc/config/i386/i386.c
>+++ b/gcc/config/i386/i386.c
>@@ -10281,6 +10281,256 @@ ix86_standard_x87sse_constant_load_p (const rtx_insn *insn, rtx dst)
>   return true;
> }
> 
>+/* Return the opcode of the TYPE_SSEMOV instruction.  To move from
>+   or to xmm16-xmm31/ymm16-ymm31 registers, we either require
>+   TARGET_AVX512VL or it is a register to register move which can
>+   be done with zmm register move. */
>+
>+static const char *
>+ix86_get_ssemov (rtx *operands, unsigned size, machine_mode mode)
>+{
>+  static char buf[128];
>+  bool misaligned_p = (misaligned_operand (operands[0], mode)
>+                     || misaligned_operand (operands[1], mode));
>+  bool evex_reg_p = (EXT_REX_SSE_REG_P (operands[0])
>+                   || EXT_REX_SSE_REG_P (operands[1]));
>+  machine_mode scalar_mode = GET_MODE_INNER (mode);
>+
>+  const char *opcode = NULL;
>+  enum
>+    {
>+      opcode_int,
>+      opcode_float,
>+      opcode_double
>+    } type = opcode_int;
>+  if (SCALAR_FLOAT_MODE_P (scalar_mode))
>+    {
>+      switch (scalar_mode)
>+      {
>+      case E_SFmode:
>+        if (size == 64 || !evex_reg_p || TARGET_AVX512VL)
>+          opcode = misaligned_p ? "%vmovups" : "%vmovaps";
>+        else
>+          type = opcode_float;
>+        break;
>+      case E_DFmode:
>+        if (size == 64 || !evex_reg_p || TARGET_AVX512VL)
>+          opcode = misaligned_p ? "%vmovupd" : "%vmovapd";
>+        else
>+          type = opcode_double;
>+        break;
>+      case E_TFmode:
>+        if (size == 64)
>+          opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
>+        else if (evex_reg_p)
>+          {
>+            if (TARGET_AVX512VL)
>+              opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
>+          }
>+        else
>+          opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa";
>+        break;
>+      default:
>+        gcc_unreachable ();
>+      }
>+    }
>+  else if (SCALAR_INT_MODE_P (scalar_mode))
>+    {
>+      switch (scalar_mode)
>+      {
>+      case E_QImode:
>+        if (size == 64)
>+          opcode = (misaligned_p
>+                    ? (TARGET_AVX512BW
>+                       ? "vmovdqu8"
>+                       : "vmovdqu64")
>+                    : "vmovdqa64");
>+        else if (evex_reg_p)
>+          {
>+            if (TARGET_AVX512VL)
>+              opcode = (misaligned_p
>+                        ? (TARGET_AVX512BW
>+                           ? "vmovdqu8"
>+                           : "vmovdqu64")
>+                        : "vmovdqa64");
>+          }
>+        else
>+          opcode = (misaligned_p
>+                    ? (TARGET_AVX512BW
>+                       ? "vmovdqu8"
>+                       : "%vmovdqu")
>+                    : "%vmovdqa");
>+        break;
>+      case E_HImode:
>+        if (size == 64)
>+          opcode = (misaligned_p
>+                    ? (TARGET_AVX512BW
>+                       ? "vmovdqu16"
>+                       : "vmovdqu64")
>+                    : "vmovdqa64");
>+        else if (evex_reg_p)
>+          {
>+            if (TARGET_AVX512VL)
>+              opcode = (misaligned_p
>+                        ? (TARGET_AVX512BW
>+                           ? "vmovdqu16"
>+                           : "vmovdqu64")
>+                        : "vmovdqa64");
>+          }
>+        else
>+          opcode = (misaligned_p
>+                    ? (TARGET_AVX512BW
>+                       ? "vmovdqu16"
>+                       : "%vmovdqu")
>+                    : "%vmovdqa");
>+        break;
>+      case E_SImode:
>+        if (size == 64)
>+          opcode = misaligned_p ? "vmovdqu32" : "vmovdqa32";
>+        else if (evex_reg_p)
>+          {
>+            if (TARGET_AVX512VL)
>+              opcode = misaligned_p ? "vmovdqu32" : "vmovdqa32";
>+          }
>+        else
>+          opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa";
>+        break;
>+      case E_DImode:
>+      case E_TImode:
>+      case E_OImode:
>+        if (size == 64)
>+          opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
>+        else if (evex_reg_p)
>+          {
>+            if (TARGET_AVX512VL)
>+              opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
>+          }
>+        else
>+          opcode = misaligned_p ? "%vmovdqu" : "%vmovdqa";
>+        break;
>+      case E_XImode:
>+        opcode = misaligned_p ? "vmovdqu64" : "vmovdqa64";
>+        break;
>+      default:
>+        gcc_unreachable ();
>+      }
>+    }
>+  else
>+    gcc_unreachable ();
>+
>+  if (!opcode)
>+    {
>+      /* NB: We get here only because we move xmm16-xmm31/ymm16-ymm31
>+         registers without AVX512VL by using zmm register move.  */
>+      if (!evex_reg_p
>+        || TARGET_AVX512VL
>+        || memory_operand (operands[0], mode)
>+        || memory_operand (operands[1], mode))
>+      gcc_unreachable ();
>+      size = 64;
>+      switch (type)
>+      {
>+      case opcode_int:
>+        opcode = misaligned_p ? "vmovdqu32" : "vmovdqa32";
>+        break;
>+      case opcode_float:
>+        opcode = misaligned_p ? "%vmovups" : "%vmovaps";
>+        break;
>+      case opcode_double:
>+        opcode = misaligned_p ? "%vmovupd" : "%vmovapd";
>+        break;
>+      }
>+    }
>+
>+  switch (size)
>+    {
>+    case 64:
>+      snprintf (buf, sizeof (buf), "%s\t{%%g1, %%g0|%%g0, %%g1}",
>+              opcode);
>+      break;
>+    case 32:
>+      snprintf (buf, sizeof (buf), "%s\t{%%t1, %%t0|%%t0, %%t1}",
>+              opcode);
>+      break;
>+    case 16:
>+      snprintf (buf, sizeof (buf), "%s\t{%%x1, %%x0|%%x0, %%x1}",
>+              opcode);
>+      break;
>+    default:
>+      gcc_unreachable ();
>+    }
>+  return buf;
>+}
>+
>+/* Return the template of the TYPE_SSEMOV instruction to move
>+   operands[1] into operands[0].  */
>+
>+const char *
>+ix86_output_ssemov (rtx_insn *insn, rtx *operands)
>+{
>+  machine_mode mode = GET_MODE (operands[0]);
>+  if (get_attr_type (insn) != TYPE_SSEMOV
>+      || mode != GET_MODE (operands[1]))
>+    gcc_unreachable ();
>+
>+  enum attr_mode insn_mode = get_attr_mode (insn);
>+
>+  switch (insn_mode)
>+    {
>+    case MODE_XI:
>+    case MODE_V8DF:
>+    case MODE_V16SF:
>+      return ix86_get_ssemov (operands, 64, mode);
>+
>+    case MODE_OI:
>+    case MODE_V4DF:
>+    case MODE_V8SF:
>+      return ix86_get_ssemov (operands, 32, mode);
>+
>+    case MODE_TI:
>+    case MODE_V2DF:
>+    case MODE_V4SF:
>+      return ix86_get_ssemov (operands, 16, mode);
>+
>+    case MODE_DI:
>+      /* Handle broken assemblers that require movd instead of movq. */
>+      if (!HAVE_AS_IX86_INTERUNIT_MOVQ
>+        && (GENERAL_REG_P (operands[0])
>+            || GENERAL_REG_P (operands[1])))
>+      return "%vmovd\t{%1, %0|%0, %1}";
>+      else
>+      return "%vmovq\t{%1, %0|%0, %1}";
>+
>+    case MODE_V2SF:
>+      if (TARGET_AVX && REG_P (operands[0]))
>+      return "vmovlps\t{%1, %d0|%d0, %1}";
>+      else
>+      return "%vmovlps\t{%1, %0|%0, %1}";
>+
>+    case MODE_DF:
>+      if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>+      return "vmovsd\t{%d1, %0|%0, %d1}";
>+      else
>+      return "%vmovsd\t{%1, %0|%0, %1}";
>+
>+    case MODE_V1DF:
>+      gcc_assert (!TARGET_AVX);
>+       return "movlpd\t{%1, %0|%0, %1}";
>+
>+    case MODE_SI:
>+      return "%vmovd\t{%1, %0|%0, %1}";
>+
>+    case MODE_SF:
>+      if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>+      return "vmovss\t{%d1, %0|%0, %d1}";
>+      else
>+      return "%vmovss\t{%1, %0|%0, %1}";
>+
>+    default:
>+      gcc_unreachable ();
>+    }
>+}
>+
> /* Returns true if OP contains a symbol reference */
> 
> bool
>diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
>index 9948f77fca5..40ed93dc804 100644
>--- a/gcc/config/i386/i386.md
>+++ b/gcc/config/i386/i386.md
>@@ -1878,11 +1878,7 @@
>       return standard_sse_constant_opcode (insn, operands);
> 
>     case TYPE_SSEMOV:
>-      if (misaligned_operand (operands[0], XImode)
>-        || misaligned_operand (operands[1], XImode))
>-      return "vmovdqu32\t{%1, %0|%0, %1}";
>-      else
>-      return "vmovdqa32\t{%1, %0|%0, %1}";
>+      return ix86_output_ssemov (insn, operands);
> 
>     default:
>       gcc_unreachable ();
>@@ -1905,25 +1901,7 @@
>       return standard_sse_constant_opcode (insn, operands);
> 
>     case TYPE_SSEMOV:
>-      if (misaligned_operand (operands[0], OImode)
>-        || misaligned_operand (operands[1], OImode))
>-      {
>-        if (get_attr_mode (insn) == MODE_V8SF)
>-          return "vmovups\t{%1, %0|%0, %1}";
>-        else if (get_attr_mode (insn) == MODE_XI)
>-          return "vmovdqu32\t{%1, %0|%0, %1}";
>-        else
>-          return "vmovdqu\t{%1, %0|%0, %1}";
>-      }
>-      else
>-      {
>-        if (get_attr_mode (insn) == MODE_V8SF)
>-          return "vmovaps\t{%1, %0|%0, %1}";
>-        else if (get_attr_mode (insn) == MODE_XI)
>-          return "vmovdqa32\t{%1, %0|%0, %1}";
>-        else
>-          return "vmovdqa\t{%1, %0|%0, %1}";
>-      }
>+      return ix86_output_ssemov (insn, operands);
> 
>     default:
>       gcc_unreachable ();
>@@ -1933,13 +1911,7 @@
>    (set_attr "type" "sselog1,sselog1,ssemov,ssemov")
>    (set_attr "prefix" "vex")
>    (set (attr "mode")
>-      (cond [(ior (match_operand 0 "ext_sse_reg_operand")
>-                  (match_operand 1 "ext_sse_reg_operand"))
>-               (const_string "XI")
>-             (and (eq_attr "alternative" "1")
>-                  (match_test "TARGET_AVX512VL"))
>-               (const_string "XI")
>-             (ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
>+      (cond [(ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
>                   (and (eq_attr "alternative" "3")
>                        (match_test "TARGET_SSE_TYPELESS_STORES")))
>                (const_string "V8SF")
>@@ -1965,27 +1937,7 @@
>       return standard_sse_constant_opcode (insn, operands);
> 
>     case TYPE_SSEMOV:
>-      /* TDmode values are passed as TImode on the stack.  Moving them
>-       to stack may result in unaligned memory access.  */
>-      if (misaligned_operand (operands[0], TImode)
>-        || misaligned_operand (operands[1], TImode))
>-      {
>-        if (get_attr_mode (insn) == MODE_V4SF)
>-          return "%vmovups\t{%1, %0|%0, %1}";
>-        else if (get_attr_mode (insn) == MODE_XI)
>-          return "vmovdqu32\t{%1, %0|%0, %1}";
>-        else
>-          return "%vmovdqu\t{%1, %0|%0, %1}";
>-      }
>-      else
>-      {
>-        if (get_attr_mode (insn) == MODE_V4SF)
>-          return "%vmovaps\t{%1, %0|%0, %1}";
>-        else if (get_attr_mode (insn) == MODE_XI)
>-          return "vmovdqa32\t{%1, %0|%0, %1}";
>-        else
>-          return "%vmovdqa\t{%1, %0|%0, %1}";
>-      }
>+      return ix86_output_ssemov (insn, operands);
> 
>     default:
>       gcc_unreachable ();
>@@ -2012,12 +1964,6 @@
>    (set (attr "mode")
>       (cond [(eq_attr "alternative" "0,1")
>                (const_string "DI")
>-             (ior (match_operand 0 "ext_sse_reg_operand")
>-                  (match_operand 1 "ext_sse_reg_operand"))
>-               (const_string "XI")
>-             (and (eq_attr "alternative" "3")
>-                  (match_test "TARGET_AVX512VL"))
>-               (const_string "XI")
>              (ior (not (match_test "TARGET_SSE2"))
>                   (ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
>                        (and (eq_attr "alternative" "5")
>@@ -2091,31 +2037,7 @@
>       return standard_sse_constant_opcode (insn, operands);
> 
>     case TYPE_SSEMOV:
>-      switch (get_attr_mode (insn))
>-      {
>-      case MODE_DI:
>-        /* Handle broken assemblers that require movd instead of movq.  */
>-        if (!HAVE_AS_IX86_INTERUNIT_MOVQ
>-            && (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])))
>-          return "%vmovd\t{%1, %0|%0, %1}";
>-        return "%vmovq\t{%1, %0|%0, %1}";
>-
>-      case MODE_TI:
>-        /* Handle AVX512 registers set.  */
>-        if (EXT_REX_SSE_REG_P (operands[0])
>-            || EXT_REX_SSE_REG_P (operands[1]))
>-          return "vmovdqa64\t{%1, %0|%0, %1}";
>-        return "%vmovdqa\t{%1, %0|%0, %1}";
>-
>-      case MODE_V2SF:
>-        gcc_assert (!TARGET_AVX);
>-        return "movlps\t{%1, %0|%0, %1}";
>-      case MODE_V4SF:
>-        return "%vmovaps\t{%1, %0|%0, %1}";
>-
>-      default:
>-        gcc_unreachable ();
>-      }
>+      return ix86_output_ssemov (insn, operands);
> 
>     case TYPE_SSECVT:
>       if (SSE_REG_P (operands[0]))
>@@ -2201,10 +2123,7 @@
>      (cond [(eq_attr "alternative" "2")
>             (const_string "SI")
>           (eq_attr "alternative" "12,13")
>-            (cond [(ior (match_operand 0 "ext_sse_reg_operand")
>-                        (match_operand 1 "ext_sse_reg_operand"))
>-                     (const_string "TI")
>-                   (ior (not (match_test "TARGET_SSE2"))
>+            (cond [(ior (not (match_test "TARGET_SSE2"))
>                         (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
>                      (const_string "V4SF")
>                    (match_test "TARGET_AVX")
>@@ -2327,25 +2246,7 @@
>       gcc_unreachable ();
> 
>     case TYPE_SSEMOV:
>-      switch (get_attr_mode (insn))
>-      {
>-      case MODE_SI:
>-          return "%vmovd\t{%1, %0|%0, %1}";
>-      case MODE_TI:
>-        return "%vmovdqa\t{%1, %0|%0, %1}";
>-      case MODE_XI:
>-        return "vmovdqa32\t{%g1, %g0|%g0, %g1}";
>-
>-      case MODE_V4SF:
>-        return "%vmovaps\t{%1, %0|%0, %1}";
>-
>-      case MODE_SF:
>-        gcc_assert (!TARGET_AVX);
>-          return "movss\t{%1, %0|%0, %1}";
>-
>-      default:
>-        gcc_unreachable ();
>-      }
>+      return ix86_output_ssemov (insn, operands);
> 
>     case TYPE_MMX:
>       return "pxor\t%0, %0";
>@@ -2411,10 +2312,7 @@
>      (cond [(eq_attr "alternative" "2,3")
>             (const_string "DI")
>           (eq_attr "alternative" "8,9")
>-            (cond [(ior (match_operand 0 "ext_sse_reg_operand")
>-                        (match_operand 1 "ext_sse_reg_operand"))
>-                     (const_string "XI")
>-                   (ior (not (match_test "TARGET_SSE2"))
>+            (cond [(ior (not (match_test "TARGET_SSE2"))
>                         (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
>                      (const_string "V4SF")
>                    (match_test "TARGET_AVX")
>@@ -3234,31 +3132,7 @@
>       return standard_sse_constant_opcode (insn, operands);
> 
>     case TYPE_SSEMOV:
>-      /* Handle misaligned load/store since we
>-         don't have movmisaligntf pattern. */
>-      if (misaligned_operand (operands[0], TFmode)
>-        || misaligned_operand (operands[1], TFmode))
>-      {
>-        if (get_attr_mode (insn) == MODE_V4SF)
>-          return "%vmovups\t{%1, %0|%0, %1}";
>-        else if (TARGET_AVX512VL
>-                 && (EXT_REX_SSE_REG_P (operands[0])
>-                     || EXT_REX_SSE_REG_P (operands[1])))
>-          return "vmovdqu64\t{%1, %0|%0, %1}";
>-        else
>-          return "%vmovdqu\t{%1, %0|%0, %1}";
>-      }
>-      else
>-      {
>-        if (get_attr_mode (insn) == MODE_V4SF)
>-          return "%vmovaps\t{%1, %0|%0, %1}";
>-        else if (TARGET_AVX512VL
>-                 && (EXT_REX_SSE_REG_P (operands[0])
>-                     || EXT_REX_SSE_REG_P (operands[1])))
>-          return "vmovdqa64\t{%1, %0|%0, %1}";
>-        else
>-          return "%vmovdqa\t{%1, %0|%0, %1}";
>-      }
>+      return ix86_output_ssemov (insn, operands);
> 
>     case TYPE_MULTI:
>       return "#";
>@@ -3411,37 +3285,7 @@
>       return standard_sse_constant_opcode (insn, operands);
> 
>     case TYPE_SSEMOV:
>-      switch (get_attr_mode (insn))
>-      {
>-      case MODE_DF:
>-        if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>-          return "vmovsd\t{%d1, %0|%0, %d1}";
>-        return "%vmovsd\t{%1, %0|%0, %1}";
>-
>-      case MODE_V4SF:
>-        return "%vmovaps\t{%1, %0|%0, %1}";
>-      case MODE_V8DF:
>-        return "vmovapd\t{%g1, %g0|%g0, %g1}";
>-      case MODE_V2DF:
>-        return "%vmovapd\t{%1, %0|%0, %1}";
>-
>-      case MODE_V2SF:
>-        gcc_assert (!TARGET_AVX);
>-        return "movlps\t{%1, %0|%0, %1}";
>-      case MODE_V1DF:
>-        gcc_assert (!TARGET_AVX);
>-        return "movlpd\t{%1, %0|%0, %1}";
>-
>-      case MODE_DI:
>-        /* Handle broken assemblers that require movd instead of movq.  */
>-        if (!HAVE_AS_IX86_INTERUNIT_MOVQ
>-            && (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])))
>-          return "%vmovd\t{%1, %0|%0, %1}";
>-        return "%vmovq\t{%1, %0|%0, %1}";
>-
>-      default:
>-        gcc_unreachable ();
>-      }
>+      return ix86_output_ssemov (insn, operands);
> 
>     default:
>       gcc_unreachable ();
>@@ -3497,9 +3341,6 @@
>              (eq_attr "alternative" "12,16")
>                (cond [(not (match_test "TARGET_SSE2"))
>                         (const_string "V4SF")
>-                      (and (match_test "TARGET_AVX512F")
>-                        (not (match_test "TARGET_PREFER_AVX256")))
>-                        (const_string "XI")
>                       (match_test "TARGET_AVX")
>                         (const_string "V2DF")
>                       (match_test "optimize_function_for_size_p (cfun)")
>@@ -3515,12 +3356,7 @@
> 
>              /* movaps is one byte shorter for non-AVX targets.  */
>              (eq_attr "alternative" "13,17")
>-               (cond [(and (ior (not (match_test "TARGET_PREFER_AVX256"))
>-                                (not (match_test "TARGET_AVX512VL")))
>-                           (ior (match_operand 0 "ext_sse_reg_operand")
>-                                (match_operand 1 "ext_sse_reg_operand")))
>-                        (const_string "V8DF")
>-                      (ior (not (match_test "TARGET_SSE2"))
>+               (cond [(ior (not (match_test "TARGET_SSE2"))
>                            (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
>                         (const_string "V4SF")
>                       (match_test "TARGET_SSE_PARTIAL_REG_DEPENDENCY")
>@@ -3612,24 +3448,7 @@
>       return standard_sse_constant_opcode (insn, operands);
> 
>     case TYPE_SSEMOV:
>-      switch (get_attr_mode (insn))
>-      {
>-      case MODE_SF:
>-        if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>-          return "vmovss\t{%d1, %0|%0, %d1}";
>-        return "%vmovss\t{%1, %0|%0, %1}";
>-
>-      case MODE_V16SF:
>-        return "vmovaps\t{%g1, %g0|%g0, %g1}";
>-      case MODE_V4SF:
>-        return "%vmovaps\t{%1, %0|%0, %1}";
>-
>-      case MODE_SI:
>-        return "%vmovd\t{%1, %0|%0, %1}";
>-
>-      default:
>-        gcc_unreachable ();
>-      }
>+      return ix86_output_ssemov (insn, operands);
> 
>     case TYPE_MMXMOV:
>       switch (get_attr_mode (insn))
>@@ -3702,12 +3521,7 @@
>                 better to maintain the whole registers in single format
>                 to avoid problems on using packed logical operations.  */
>              (eq_attr "alternative" "6")
>-               (cond [(and (ior (not (match_test "TARGET_PREFER_AVX256"))
>-                                (not (match_test "TARGET_AVX512VL")))
>-                           (ior (match_operand 0 "ext_sse_reg_operand")
>-                                (match_operand 1 "ext_sse_reg_operand")))
>-                        (const_string "V16SF")
>-                      (ior (match_test "TARGET_SSE_PARTIAL_REG_DEPENDENCY")
>+               (cond [(ior (match_test "TARGET_SSE_PARTIAL_REG_DEPENDENCY")
>                            (match_test "TARGET_SSE_SPLIT_REGS"))
>                         (const_string "V4SF")
>                      ]
>diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
>index c1e0f2c411e..9c3808338d3 100644
>--- a/gcc/config/i386/mmx.md
>+++ b/gcc/config/i386/mmx.md
>@@ -115,29 +115,7 @@
>       return standard_sse_constant_opcode (insn, operands);
> 
>     case TYPE_SSEMOV:
>-      switch (get_attr_mode (insn))
>-      {
>-      case MODE_DI:
>-        /* Handle broken assemblers that require movd instead of movq.  */
>-        if (!HAVE_AS_IX86_INTERUNIT_MOVQ
>-            && (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])))
>-          return "%vmovd\t{%1, %0|%0, %1}";
>-        return "%vmovq\t{%1, %0|%0, %1}";
>-      case MODE_TI:
>-        return "%vmovdqa\t{%1, %0|%0, %1}";
>-      case MODE_XI:
>-        return "vmovdqa64\t{%g1, %g0|%g0, %g1}";
>-
>-      case MODE_V2SF:
>-        if (TARGET_AVX && REG_P (operands[0]))
>-          return "vmovlps\t{%1, %0, %0|%0, %0, %1}";
>-        return "%vmovlps\t{%1, %0|%0, %1}";
>-      case MODE_V4SF:
>-        return "%vmovaps\t{%1, %0|%0, %1}";
>-
>-      default:
>-        gcc_unreachable ();
>-      }
>+      return ix86_output_ssemov (insn, operands);
> 
>     default:
>       gcc_unreachable ();
>@@ -186,10 +164,7 @@
>      (cond [(eq_attr "alternative" "2")
>             (const_string "SI")
>           (eq_attr "alternative" "11,12")
>-            (cond [(ior (match_operand 0 "ext_sse_reg_operand")
>-                        (match_operand 1 "ext_sse_reg_operand"))
>-                      (const_string "XI")
>-                   (match_test "<MODE>mode == V2SFmode")
>+            (cond [(match_test "<MODE>mode == V2SFmode")
>                      (const_string "V4SF")
>                    (ior (not (match_test "TARGET_SSE2"))
>                         (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
>diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
>index 865947debcc..99226e86436 100644
>--- a/gcc/config/i386/predicates.md
>+++ b/gcc/config/i386/predicates.md
>@@ -54,11 +54,6 @@
>   (and (match_code "reg")
>        (match_test "SSE_REGNO_P (REGNO (op))")))
> 
>-;; True if the operand is an AVX-512 new register.
>-(define_predicate "ext_sse_reg_operand"
>-  (and (match_code "reg")
>-       (match_test "EXT_REX_SSE_REGNO_P (REGNO (op))")))
>-
> ;; Return true if op is a QImode register.
> (define_predicate "any_QIreg_operand"
>   (and (match_code "reg")
>diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
>index 5dc0930ac1f..2014f0a7832 100644
>--- a/gcc/config/i386/sse.md
>+++ b/gcc/config/i386/sse.md
>@@ -982,98 +982,7 @@
>       return standard_sse_constant_opcode (insn, operands);
> 
>     case TYPE_SSEMOV:
>-      /* There is no evex-encoded vmov* for sizes smaller than 64-bytes
>-       in avx512f, so we need to use workarounds, to access sse registers
>-       16-31, which are evex-only. In avx512vl we don't need workarounds.  */
>-      if (TARGET_AVX512F && <MODE_SIZE> < 64 && !TARGET_AVX512VL
>-        && (EXT_REX_SSE_REG_P (operands[0])
>-            || EXT_REX_SSE_REG_P (operands[1])))
>-      {
>-        if (memory_operand (operands[0], <MODE>mode))
>-          {
>-            if (<MODE_SIZE> == 32)
>-              return "vextract<shuffletype>64x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
>-            else if (<MODE_SIZE> == 16)
>-              return "vextract<shuffletype>32x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
>-            else
>-              gcc_unreachable ();
>-          }
>-        else if (memory_operand (operands[1], <MODE>mode))
>-          {
>-            if (<MODE_SIZE> == 32)
>-              return "vbroadcast<shuffletype>64x4\t{%1, %g0|%g0, %1}";
>-            else if (<MODE_SIZE> == 16)
>-              return "vbroadcast<shuffletype>32x4\t{%1, %g0|%g0, %1}";
>-            else
>-              gcc_unreachable ();
>-          }
>-        else
>-          /* Reg -> reg move is always aligned.  Just use wider move.  */
>-          switch (get_attr_mode (insn))
>-            {
>-            case MODE_V8SF:
>-            case MODE_V4SF:
>-              return "vmovaps\t{%g1, %g0|%g0, %g1}";
>-            case MODE_V4DF:
>-            case MODE_V2DF:
>-              return "vmovapd\t{%g1, %g0|%g0, %g1}";
>-            case MODE_OI:
>-            case MODE_TI:
>-              return "vmovdqa64\t{%g1, %g0|%g0, %g1}";
>-            default:
>-              gcc_unreachable ();
>-            }
>-      }
>-
>-      switch (get_attr_mode (insn))
>-      {
>-      case MODE_V16SF:
>-      case MODE_V8SF:
>-      case MODE_V4SF:
>-        if (misaligned_operand (operands[0], <MODE>mode)
>-            || misaligned_operand (operands[1], <MODE>mode))
>-          return "%vmovups\t{%1, %0|%0, %1}";
>-        else
>-          return "%vmovaps\t{%1, %0|%0, %1}";
>-
>-      case MODE_V8DF:
>-      case MODE_V4DF:
>-      case MODE_V2DF:
>-        if (misaligned_operand (operands[0], <MODE>mode)
>-            || misaligned_operand (operands[1], <MODE>mode))
>-          return "%vmovupd\t{%1, %0|%0, %1}";
>-        else
>-          return "%vmovapd\t{%1, %0|%0, %1}";
>-
>-      case MODE_OI:
>-      case MODE_TI:
>-        if (misaligned_operand (operands[0], <MODE>mode)
>-            || misaligned_operand (operands[1], <MODE>mode))
>-          return TARGET_AVX512VL
>-                 && (<MODE>mode == V4SImode
>-                     || <MODE>mode == V2DImode
>-                     || <MODE>mode == V8SImode
>-                     || <MODE>mode == V4DImode
>-                     || TARGET_AVX512BW)
>-                 ? "vmovdqu<ssescalarsize>\t{%1, %0|%0, %1}"
>-                 : "%vmovdqu\t{%1, %0|%0, %1}";
>-        else
>-          return TARGET_AVX512VL ? "vmovdqa64\t{%1, %0|%0, %1}"
>-                                 : "%vmovdqa\t{%1, %0|%0, %1}";
>-      case MODE_XI:
>-        if (misaligned_operand (operands[0], <MODE>mode)
>-            || misaligned_operand (operands[1], <MODE>mode))
>-          return (<MODE>mode == V16SImode
>-                  || <MODE>mode == V8DImode
>-                  || TARGET_AVX512BW)
>-                 ? "vmovdqu<ssescalarsize>\t{%1, %0|%0, %1}"
>-                 : "vmovdqu64\t{%1, %0|%0, %1}";
>-        else
>-          return "vmovdqa64\t{%1, %0|%0, %1}";
>-
>-      default:
>-        gcc_unreachable ();
>-      }
>+      return ix86_output_ssemov (insn, operands);
> 
>     default:
>       gcc_unreachable ();
>@@ -1082,10 +991,7 @@
>   [(set_attr "type" "sselog1,sselog1,ssemov,ssemov")
>    (set_attr "prefix" "maybe_vex")
>    (set (attr "mode")
>-      (cond [(and (eq_attr "alternative" "1")
>-                  (match_test "TARGET_AVX512VL"))
>-               (const_string "<sseinsnmode>")
>-             (and (match_test "<MODE_SIZE> == 16")
>+      (cond [(and (match_test "<MODE_SIZE> == 16")
>                   (ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
>                        (and (eq_attr "alternative" "3")
>                             (match_test "TARGET_SSE_TYPELESS_STORES"))))
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-2a.c b/gcc/testsuite/gcc.target/i386/pr89229-2a.c
>new file mode 100644
>index 00000000000..0cf78039481
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-2a.c
>@@ -0,0 +1,15 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512" } */
>+
>+typedef __int128 __m128t __attribute__ ((__vector_size__ (16),
>+                                       __may_alias__));
>+
>+__m128t
>+foo1 (void)
>+{
>+  register __int128 xmm16 __asm ("xmm16") = (__int128) -1;
>+  asm volatile ("" : "+v" (xmm16));
>+  return (__m128t) xmm16;
>+}
>+
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-2b.c b/gcc/testsuite/gcc.target/i386/pr89229-2b.c
>new file mode 100644
>index 00000000000..8d5d6c41d30
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-2b.c
>@@ -0,0 +1,13 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
>+
>+typedef __int128 __m128t __attribute__ ((__vector_size__ (16),
>+                                       __may_alias__));
>+
>+__m128t
>+foo1 (void)
>+{
>+  register __int128 xmm16 __asm ("xmm16") = (__int128) -1; /* { dg-error "register specified for 'xmm16'" } */
>+  asm volatile ("" : "+v" (xmm16));
>+  return (__m128t) xmm16;
>+}
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-2c.c b/gcc/testsuite/gcc.target/i386/pr89229-2c.c
>new file mode 100644
>index 00000000000..218da46dcd0
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-2c.c
>@@ -0,0 +1,6 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
>+
>+#include "pr89229-2a.c"
>+
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-3a.c b/gcc/testsuite/gcc.target/i386/pr89229-3a.c
>new file mode 100644
>index 00000000000..fd56f447016
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-3a.c
>@@ -0,0 +1,17 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512" } */
>+
>+extern int i;
>+
>+int
>+foo1 (void)
>+{
>+  register int xmm16 __asm ("xmm16") = i;
>+  asm volatile ("" : "+v" (xmm16));
>+  register int xmm17 __asm ("xmm17") = xmm16;
>+  asm volatile ("" : "+v" (xmm17));
>+  return xmm17;
>+}
>+
>+/* { dg-final { scan-assembler-times "vmovdqa32\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-3b.c b/gcc/testsuite/gcc.target/i386/pr89229-3b.c
>new file mode 100644
>index 00000000000..9265fc0354b
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-3b.c
>@@ -0,0 +1,6 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
>+
>+#include "pr89229-3a.c"
>+
>+/* { dg-final { scan-assembler-times "vmovdqa32\[^\n\r]*zmm1\[67]\[^\n\r]*zmm1\[67]" 1 } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-3c.c b/gcc/testsuite/gcc.target/i386/pr89229-3c.c
>new file mode 100644
>index 00000000000..d3fdf1ee273
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-3c.c
>@@ -0,0 +1,7 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
>+
>+#include "pr89229-3a.c"
>+
>+/* { dg-final { scan-assembler-times "vmovdqa32\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-4a.c b/gcc/testsuite/gcc.target/i386/pr89229-4a.c
>new file mode 100644
>index 00000000000..cb9b071e873
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-4a.c
>@@ -0,0 +1,17 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
>+
>+extern long long i;
>+
>+long long
>+foo1 (void)
>+{
>+  register long long xmm16 __asm ("xmm16") = i;
>+  asm volatile ("" : "+v" (xmm16));
>+  register long long xmm17 __asm ("xmm17") = xmm16;
>+  asm volatile ("" : "+v" (xmm17));
>+  return xmm17;
>+}
>+
>+/* { dg-final { scan-assembler-times "vmovdqa64\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-4b.c b/gcc/testsuite/gcc.target/i386/pr89229-4b.c
>new file mode 100644
>index 00000000000..023e81253a0
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-4b.c
>@@ -0,0 +1,6 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
>+
>+#include "pr89229-4a.c"
>+
>+/* { dg-final { scan-assembler-times "vmovdqa32\[^\n\r]*zmm1\[67]\[^\n\r]*zmm1\[67]" 1 } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-4c.c b/gcc/testsuite/gcc.target/i386/pr89229-4c.c
>new file mode 100644
>index 00000000000..e02eb37c16d
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-4c.c
>@@ -0,0 +1,7 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
>+
>+#include "pr89229-4a.c"
>+
>+/* { dg-final { scan-assembler-times "vmovdqa64\[^\n\r]*xmm1\[67]\[^\n\r]*xmm1\[67]" 1 } } */
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-5a.c b/gcc/testsuite/gcc.target/i386/pr89229-5a.c
>new file mode 100644
>index 00000000000..856115b2f5a
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-5a.c
>@@ -0,0 +1,16 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512" } */
>+
>+extern float d;
>+
>+void
>+foo1 (float x)
>+{
>+  register float xmm16 __asm ("xmm16") = x;
>+  asm volatile ("" : "+v" (xmm16));
>+  register float xmm17 __asm ("xmm17") = xmm16;
>+  asm volatile ("" : "+v" (xmm17));
>+  d = xmm17;
>+}
>+
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-5b.c b/gcc/testsuite/gcc.target/i386/pr89229-5b.c
>new file mode 100644
>index 00000000000..cb0f3b55ccc
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-5b.c
>@@ -0,0 +1,6 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
>+
>+#include "pr89229-5a.c"
>+
>+/* { dg-final { scan-assembler-times "vmovaps\[^\n\r]*zmm1\[67]\[^\n\r]*zmm1\[67]" 1 } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-5c.c b/gcc/testsuite/gcc.target/i386/pr89229-5c.c
>new file mode 100644
>index 00000000000..529a520133c
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-5c.c
>@@ -0,0 +1,6 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
>+
>+#include "pr89229-5a.c"
>+
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-6a.c b/gcc/testsuite/gcc.target/i386/pr89229-6a.c
>new file mode 100644
>index 00000000000..f88d7c8d74c
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-6a.c
>@@ -0,0 +1,16 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512" } */
>+
>+extern double d;
>+
>+void
>+foo1 (double x)
>+{
>+  register double xmm16 __asm ("xmm16") = x;
>+  asm volatile ("" : "+v" (xmm16));
>+  register double xmm17 __asm ("xmm17") = xmm16;
>+  asm volatile ("" : "+v" (xmm17));
>+  d = xmm17;
>+}
>+
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-6b.c b/gcc/testsuite/gcc.target/i386/pr89229-6b.c
>new file mode 100644
>index 00000000000..316d85d921e
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-6b.c
>@@ -0,0 +1,6 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
>+
>+#include "pr89229-6a.c"
>+
>+/* { dg-final { scan-assembler-times "vmovapd\[^\n\r]*zmm1\[67]\[^\n\r]*zmm1\[67]" 1 } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-6c.c b/gcc/testsuite/gcc.target/i386/pr89229-6c.c
>new file mode 100644
>index 00000000000..7a4d254670c
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-6c.c
>@@ -0,0 +1,6 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
>+
>+#include "pr89229-6a.c"
>+
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-7a.c b/gcc/testsuite/gcc.target/i386/pr89229-7a.c
>new file mode 100644
>index 00000000000..fcb85c366b6
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-7a.c
>@@ -0,0 +1,16 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512" } */
>+
>+extern __float128 d;
>+
>+void
>+foo1 (__float128 x)
>+{
>+  register __float128 xmm16 __asm ("xmm16") = x;
>+  asm volatile ("" : "+v" (xmm16));
>+  register __float128 xmm17 __asm ("xmm17") = xmm16;
>+  asm volatile ("" : "+v" (xmm17));
>+  d = xmm17;
>+}
>+
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-7b.c b/gcc/testsuite/gcc.target/i386/pr89229-7b.c
>new file mode 100644
>index 00000000000..37eb83c783b
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-7b.c
>@@ -0,0 +1,12 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mno-avx512vl" } */
>+
>+extern __float128 d;
>+
>+void
>+foo1 (__float128 x)
>+{
>+  register __float128 xmm16 __asm ("xmm16") = x; /* { dg-error "register specified for 'xmm16'" } */
>+  asm volatile ("" : "+v" (xmm16));
>+  d = xmm16;
>+}
>diff --git a/gcc/testsuite/gcc.target/i386/pr89229-7c.c b/gcc/testsuite/gcc.target/i386/pr89229-7c.c
>new file mode 100644
>index 00000000000..e37ff2bf5bd
>--- /dev/null
>+++ b/gcc/testsuite/gcc.target/i386/pr89229-7c.c
>@@ -0,0 +1,6 @@
>+/* { dg-do compile { target { ! ia32 } } } */
>+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
>+
>+#include "pr89229-7a.c"
>+
>+/* { dg-final { scan-assembler-not "%zmm\[0-9\]+" } } */
>-- 
>2.20.1
>
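
For reference, here is a minimal reproducer of the case the patch handles.
It is adapted from gcc.target/i386/pr89229-3a.c above and assumes an
x86-64 target compiled with -O2 -march=skylake-avx512:

  /* Adapted from gcc.target/i386/pr89229-3a.c in the patch above.  */
  extern int i;

  int
  foo1 (void)
  {
    register int xmm16 __asm ("xmm16") = i;
    asm volatile ("" : "+v" (xmm16));
    register int xmm17 __asm ("xmm17") = xmm16;
    asm volatile ("" : "+v" (xmm17));
    return xmm17;
  }

With AVX512VL enabled (the default for -march=skylake-avx512), the xmm16
to xmm17 move is emitted as vmovdqa32 on xmm registers; with -mno-avx512vl
it is widened to a vmovdqa32 zmm register move.  These are the patterns the
scan-assembler directives in pr89229-3a.c and pr89229-3b.c check for.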
