[For some reason this message didn't reach my gmail account] > 1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTO > operands to vector broadcast from an integer with AVX2. > 2. Add ix86_gen_scratch_sse_rtx to return a scratch SSE register which > won't increase stack alignment requirement and blocks transformation by > the combine pass. > 3. Update PR 87767 tests to expect integer broadcast instead of broadcast > from memory. > 4. Update avx512f_cond_move.c to expect integer broadcast.
+ else if (TARGET_64BIT + && ix86_broadcast (val, GET_MODE_BITSIZE (DImode), + val_broadcast)) + { + /* NB: MOVQ takes a 32-bit signed immediate operand. */ + if (trunc_int_for_mode (val_broadcast, SImode) != val_broadcast) + return nullptr; + broadcast_mode = DImode; + } + else + return nullptr; We have MOVABS insn and movdi_internal knows when to switch between MOVQ and MOVABS. + if (!ix86_expand_vector_init_duplicate (false, vector_mode, target, + GEN_INT (val_broadcast))) + gcc_unreachable (); We are using: bool ok = ix86_expand_vector_init_duplicate (...); gcc_assert (ok); idiom throughout i386/. Let's keep it this way. + if (REGNO (target) < FIRST_PSEUDO_REGISTER) + target = gen_rtx_REG (mode, REGNO (target)); + else + target = convert_to_mode (mode, target, 1); + This is not needed. lowpart_subreg should do the trick when changing mode of hard regs (also see comment for ix86_gen_scratch_sse_rtx). + rtx first; + + if (can_create_pseudo_p () + && GET_MODE_SIZE (mode) >= 16 + && GET_MODE_CLASS (mode) == MODE_VECTOR_INT + && (MEM_P (op1) + && SYMBOL_REF_P (XEXP (op1, 0)) + && CONSTANT_POOL_ADDRESS_P (XEXP (op1, 0))) + && (first = ix86_broadcast_from_integer_constant (mode, op1))) + { + /* Broadcast to XMM/YMM/ZMM register from an integer constant. */ + op1 = ix86_gen_scratch_sse_rtx (mode, false); + if (!ix86_expand_vector_init_duplicate (false, mode, op1, first)) + gcc_unreachable (); + emit_move_insn (op0, op1); + return; Please try to avoid assignment inside the condition. And also use "gcc_assert (ok)" here. +/* Return a scratch register in MODE for vector load and store. If + CONSTANT_INT_BROADCAST is true, it is used to hold constant integer + broadcast result. */ + +rtx +ix86_gen_scratch_sse_rtx (machine_mode mode, + bool constant_int_broadcast) This function should always return hard reg, simply: return gen_rtx_REG (mode, (TARGET_64BIT ? LAST_REX_SSE_REG : LAST_SSE_REG)); The complications with pseudo does not bring us anything (at the end we need a hard reg anyway, and I guess reload knows quite well how to avoid used temporary). The function can then be renamed to ix86_gen_scratch_sse_reg. * gcc.target/i386/avx512f-broadcast-pr87767-1.c: Expect integer broadcast. * gcc.target/i386/avx512f-broadcast-pr87767-5.c: Likewise. * gcc.target/i386/avx512vl-broadcast-pr87767-1.c: Likewise. * gcc.target/i386/avx512vl-broadcast-pr87767-5.c: Likewise. * gcc.target/i386/avx512f_cond_move.c: Also pass -mprefer-vector-width=512 and expect integer broadcast. No review for the above changes for AVX512 tests, someone else should check if the new code is better here. Uros.