On Fri, Jul 12, 2024 at 5:33 AM Roger Sayle <ro...@nextmovesoftware.com> wrote: > > > Hi Hongtao, > Thanks for the review and pointing out the remaining uses of force_reg > that I'd overlooked. Here's a revised version of the patch that incorporates > your feedback. One minor change was that rather than using > memory_operand, which as you point out also needs to include > bcst_mem_operand, it's simpler to invert the logic to check for > register_operand [i.e. the first operand must be a register]. > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > and make -k check, both with and without --target_board=unix{-m32} > with no new failures. Ok for mainline? Ok. > > > 2024-07-11 Roger Sayle <ro...@nextmovesoftware.com> > Hongtao Liu <hongtao....@intel.com> > > gcc/ChangeLog > * config/i386/i386-expand.cc (ix86_broadcast_from_constant): > Use CONST_VECTOR_P instead of comparison against GET_CODE. > (ix86_gen_bcst_mem): Likewise. > (ix86_ternlog_leaf_p): Likewise. > (ix86_ternlog_operand_p): ix86_ternlog_leaf_p is always true for > vector_all_ones_operand. > (ix86_expand_ternlog_bin_op): Use CONST_VECTOR_P instead of > equality comparison against GET_CODE. Replace call to force_reg > with gen_reg_rtx and emit_move_insn (for VEC_DUPLICATE broadcast). > Check for !register_operand instead of memory_operand. > Support CONST_VECTORs by calling force_const_mem. > (ix86_expand_ternlog): Fix indentation whitespace. > Allow ix86_ternlog_leaf_p as ix86_expand_ternlog_andnot's second > operand. Use CONST_VECTOR_P instead of equality against GET_CODE. > Use gen_reg_rtx and emit_move_insn for ~a, ~b and ~c cases. > > > Thanks again, > Roger > > > -----Original Message----- > > From: Hongtao Liu <crazy...@gmail.com> > > Sent: 08 July 2024 02:55 > > To: Roger Sayle <ro...@nextmovesoftware.com> > > Cc: gcc-patches@gcc.gnu.org; Uros Bizjak <ubiz...@gmail.com> > > Subject: Re: [x86 SSE PATCH] Some AVX512 ternlog expansion refinements. > > > > On Sun, Jul 7, 2024 at 5:00 PM Roger Sayle <ro...@nextmovesoftware.com> > > wrote: > > > Hi Hongtao, > > > This should address concerns about the remaining use of force_reg. > > > > > 51@@ -25793,15 +25792,20 @@ ix86_expand_ternlog_binop (enum rtx_code > > code, machine_mode mode, > > 52 if (GET_MODE (op1) != mode) > > 53 op1 = gen_lowpart (mode, op1); > > 54 > > 55- if (GET_CODE (op0) == CONST_VECTOR) 56+ if (CONST_VECTOR_P (op0)) > > 57 op0 = validize_mem (force_const_mem (mode, op0)); > > 58- if (GET_CODE (op1) == CONST_VECTOR) 59+ if (CONST_VECTOR_P (op1)) > > 60 op1 = validize_mem (force_const_mem (mode, op1)); > > 61 > > 62 if (memory_operand (op0, mode)) > > 63 { > > 64 if (memory_operand (op1, mode)) > > 65- op0 = force_reg (mode, op0); > > 66+ { > > 67+ /* We can't use force_reg (op0, mode). */ > > 68+ rtx reg = gen_reg_rtx (mode); > > 69+ emit_move_insn (reg, op0); > > 70+ op0 = reg; > > 71+ } > > Shouldn't we handle bcst_mem_operand instead of > > memory_operand(bcst_memory_operand is not a memory_operand)? > > so maybe > > if (memory_operand (op0, mode0) || bcst_mem_operand (op0, mode0) > > if (memory_operand (op1, mode) || bcst_mem_operand (op1, mode0)? > > 72 else > > 73 std::swap (op0, op1); > > 74 } > > > > Also there's force_reg in below 3 cases, are there any restrictions to avoid > > bcst_mem_operand into them? > > case 0x0f: /* ~a */ > > case 0x33: /* ~b */ > > case 0x33: /* ~b */ > > .. > > if (!TARGET_64BIT && !register_operand (op2, mode)) > > op2 = force_reg (mode, op2); > > > > -- > > BR, > > Hongtao
-- BR, Hongtao