[For some reason this message didn't reach my gmail account]

> 1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR
> operands to vector broadcast from an integer with AVX2.
> 2. Add ix86_gen_scratch_sse_rtx to return a scratch SSE register which
> won't increase stack alignment requirement and blocks transformation by
> the combine pass.
> 3. Update PR 87767 tests to expect integer broadcast instead of broadcast
> from memory.
> 4. Update avx512f_cond_move.c to expect integer broadcast.
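
For readers following along, item 1 relies on recognizing a wide constant that is just one narrow integer repeated. A minimal standalone C++ sketch of that check follows. It is not the GCC implementation: `repeated_chunk` and `narrowest_broadcast_width` are hypothetical names, and the sketch only handles a single 64-bit value.

```cpp
#include <cstdint>

// Hypothetical helper, not GCC's ix86_broadcast: if every WIDTH-bit chunk
// of VAL is identical, store that chunk in *CHUNK and return true.
bool repeated_chunk(std::uint64_t val, unsigned width, std::uint64_t *chunk) {
  if (width == 0 || width >= 64 || 64 % width != 0)
    return false;
  const std::uint64_t mask = (std::uint64_t{1} << width) - 1;
  const std::uint64_t first = val & mask;
  for (unsigned shift = width; shift < 64; shift += width)
    if (((val >> shift) & mask) != first)
      return false;
  *chunk = first;
  return true;
}

// Try the narrowest element first; return 0 if VAL is not a broadcast
// of an 8-, 16- or 32-bit element.
unsigned narrowest_broadcast_width(std::uint64_t val) {
  const unsigned widths[] = {8, 16, 32};
  std::uint64_t chunk;
  for (unsigned width : widths)
    if (repeated_chunk(val, width, &chunk))
      return width;
  return 0;
}
```

With this sketch, 0x0101010101010101 is a broadcast of the byte 0x01, while 0xdeadbeefdeadbeef is only a broadcast at 32 bits.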

+  else if (TARGET_64BIT
+           && ix86_broadcast (val, GET_MODE_BITSIZE (DImode),
+                              val_broadcast))
+    {
+      /* NB: MOVQ takes a 32-bit signed immediate operand.  */
+      if (trunc_int_for_mode (val_broadcast, SImode) != val_broadcast)
+        return nullptr;
+      broadcast_mode = DImode;
+    }
+  else
+    return nullptr;

We have the MOVABS insn, and movdi_internal knows when to switch between
MOVQ and MOVABS.
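
To illustrate the distinction: a 64-bit move with a 32-bit immediate sign-extends it, while MOVABS carries the full 64-bit immediate. The following is a rough standalone C++ sketch, not GCC code. `fits_sign_extended_imm32`, `MovForm` and `pick_mov_form` are made-up names, and the real selection in movdi_internal handles more cases than this.

```cpp
#include <cstdint>

// True if V survives truncation to 32 bits followed by sign extension,
// i.e. the same question the patch asks with trunc_int_for_mode.
bool fits_sign_extended_imm32(std::int64_t v) {
  return v >= INT32_MIN && v <= INT32_MAX;
}

enum class MovForm { MovqImm32, MovabsImm64 };

// Hypothetical selector: use the short form when the immediate fits,
// otherwise fall back to the 64-bit immediate form.
MovForm pick_mov_form(std::int64_t v) {
  return fits_sign_extended_imm32(v) ? MovForm::MovqImm32
                                     : MovForm::MovabsImm64;
}
```

Under this sketch, -1 and 0x7fffffff take the short form, while 0x80000000 already needs the 64-bit immediate because sign extension would change its value.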

+  if (!ix86_expand_vector_init_duplicate (false, vector_mode, target,
+                                          GEN_INT (val_broadcast)))
+    gcc_unreachable ();

We are using:

bool ok = ix86_expand_vector_init_duplicate (...);
gcc_assert (ok);

idiom throughout i386/. Let's keep it this way.
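
A minimal standalone illustration of the idiom, using the standard `assert` in place of gcc_assert. `expand_duplicate` and `emit_broadcast` are hypothetical stand-ins, not functions from i386/. One benefit, at least with the standard macro, is that the call with side effects stays outside the assertion, so it still runs when assertions are compiled out.

```cpp
#include <cassert>

// Hypothetical stand-in for an expander that reports success or failure.
static bool expand_duplicate(int *target, int value) {
  *target = value;
  return true;
}

int emit_broadcast(int value) {
  int target = 0;
  // The call happens unconditionally; only the check is an assertion.
  bool ok = expand_duplicate(&target, value);
  assert(ok);
  (void) ok;  // Avoid an unused-variable warning when NDEBUG is defined.
  return target;
}
```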

+  if (REGNO (target) < FIRST_PSEUDO_REGISTER)
+    target = gen_rtx_REG (mode, REGNO (target));
+  else
+    target = convert_to_mode (mode, target, 1);
+

This is not needed. lowpart_subreg should do the trick when changing
the mode of hard regs (also see the comment for ix86_gen_scratch_sse_rtx).

+  rtx first;
+
+  if (can_create_pseudo_p ()
+      && GET_MODE_SIZE (mode) >= 16
+      && GET_MODE_CLASS (mode) == MODE_VECTOR_INT
+      && (MEM_P (op1)
+          && SYMBOL_REF_P (XEXP (op1, 0))
+          && CONSTANT_POOL_ADDRESS_P (XEXP (op1, 0)))
+      && (first = ix86_broadcast_from_integer_constant (mode, op1)))
+    {
+      /* Broadcast to XMM/YMM/ZMM register from an integer constant.  */
+      op1 = ix86_gen_scratch_sse_rtx (mode, false);
+      if (!ix86_expand_vector_init_duplicate (false, mode, op1, first))
+        gcc_unreachable ();
+      emit_move_insn (op0, op1);
+      return;

Please try to avoid assignment inside the condition. And also use
"gcc_assert (ok)" here.
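
Roughly the shape I have in mind, as a standalone C++ sketch rather than the actual expander. `find_broadcast_value` and `expand_move` are hypothetical names, and plain ints stand in for RTL. The cheap checks stay in the condition, and the lookup moves into the body, where its result is tested on its own line.

```cpp
#include <cstddef>

// Hypothetical lookup: return a pointer to the repeated element if all
// N elements are equal, nullptr otherwise.
static const int *find_broadcast_value(const int *elts, std::size_t n) {
  if (n == 0)
    return nullptr;
  for (std::size_t i = 1; i < n; ++i)
    if (elts[i] != elts[0])
      return nullptr;
  return &elts[0];
}

// Returns true and stores the broadcast value in *DEST when the
// broadcast path applies; otherwise leaves *DEST alone.
bool expand_move(const int *elts, std::size_t n, bool can_create_temp,
                 int *dest) {
  if (can_create_temp && n >= 4) {
    // Assignment is a separate statement, not part of the condition.
    const int *first = find_broadcast_value(elts, n);
    if (first != nullptr) {
      *dest = *first;
      return true;
    }
  }
  return false;
}
```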

+/* Return a scratch register in MODE for vector load and store.  If
+   CONSTANT_INT_BROADCAST is true, it is used to hold constant integer
+   broadcast result.  */
+
+rtx
+ix86_gen_scratch_sse_rtx (machine_mode mode,
+                          bool constant_int_broadcast)

This function should always return a hard reg, simply:

return gen_rtx_REG (mode, (TARGET_64BIT
                           ? LAST_REX_SSE_REG : LAST_SSE_REG));

The complications with pseudos do not bring us anything (in the end
we need a hard reg anyway, and I guess reload knows quite well how to
avoid an already-used temporary).

The function can then be renamed to ix86_gen_scratch_sse_reg.

* gcc.target/i386/avx512f-broadcast-pr87767-1.c: Expect integer
broadcast.
* gcc.target/i386/avx512f-broadcast-pr87767-5.c: Likewise.
* gcc.target/i386/avx512vl-broadcast-pr87767-1.c: Likewise.
* gcc.target/i386/avx512vl-broadcast-pr87767-5.c: Likewise.
* gcc.target/i386/avx512f_cond_move.c: Also pass
-mprefer-vector-width=512 and expect integer broadcast.

No review from me for the above changes to the AVX512 tests; someone else
should check whether the new code is better here.

Uros.
