On Fri, Jul 12, 2024 at 5:33 AM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> Hi Hongtao,
> Thanks for the review and pointing out the remaining uses of force_reg
> that I'd overlooked.  Here's a revised version of the patch that incorporates
> your feedback.  One minor change was that rather than using
> memory_operand, which as you point out also needs to include
> bcst_mem_operand, it's simpler to invert the logic to check for
> register_operand [i.e. the first operand must be a register].
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
Ok.
>
>
> 2024-07-11  Roger Sayle  <ro...@nextmovesoftware.com>
>             Hongtao Liu  <hongtao....@intel.com>
>
> gcc/ChangeLog
>         * config/i386/i386-expand.cc (ix86_broadcast_from_constant):
>         Use CONST_VECTOR_P instead of comparison against GET_CODE.
>         (ix86_gen_bcst_mem): Likewise.
>         (ix86_ternlog_leaf_p): Likewise.
>         (ix86_ternlog_operand_p): ix86_ternlog_leaf_p is always true for
>         vector_all_ones_operand.
>         (ix86_expand_ternlog_bin_op): Use CONST_VECTOR_P instead of
>         equality comparison against GET_CODE.  Replace call to force_reg
>         with gen_reg_rtx and emit_move_insn (for VEC_DUPLICATE broadcast).
>         Check for !register_operand instead of memory_operand.
>         Support CONST_VECTORs by calling force_const_mem.
>         (ix86_expand_ternlog): Fix indentation whitespace.
>         Allow ix86_ternlog_leaf_p as ix86_expand_ternlog_andnot's second
>         operand. Use CONST_VECTOR_P instead of equality against GET_CODE.
>         Use gen_reg_rtx and emit_move_insn for ~a, ~b and ~c cases.
>
>
> Thanks again,
> Roger
>
> > -----Original Message-----
> > From: Hongtao Liu <crazy...@gmail.com>
> > Sent: 08 July 2024 02:55
> > To: Roger Sayle <ro...@nextmovesoftware.com>
> > Cc: gcc-patches@gcc.gnu.org; Uros Bizjak <ubiz...@gmail.com>
> > Subject: Re: [x86 SSE PATCH] Some AVX512 ternlog expansion refinements.
> >
> > On Sun, Jul 7, 2024 at 5:00 PM Roger Sayle <ro...@nextmovesoftware.com>
> > wrote:
> > > Hi Hongtao,
> > > This should address concerns about the remaining use of force_reg.
> > >
> >  51@@ -25793,15 +25792,20 @@ ix86_expand_ternlog_binop (enum rtx_code
> > code, machine_mode mode,
> >  52   if (GET_MODE (op1) != mode)
> >  53     op1 = gen_lowpart (mode, op1);
> >  54
> >  55-  if (GET_CODE (op0) == CONST_VECTOR)  56+  if (CONST_VECTOR_P (op0))
> >  57     op0 = validize_mem (force_const_mem (mode, op0));
> >  58-  if (GET_CODE (op1) == CONST_VECTOR)  59+  if (CONST_VECTOR_P (op1))
> >  60     op1 = validize_mem (force_const_mem (mode, op1));
> >  61
> >  62   if (memory_operand (op0, mode))
> >  63     {
> >  64       if (memory_operand (op1, mode))
> >  65-       op0 = force_reg (mode, op0);
> >  66+       {
> >  67+         /* We can't use force_reg (op0, mode).  */
> >  68+         rtx reg = gen_reg_rtx (mode);
> >  69+         emit_move_insn (reg, op0);
> >  70+         op0 = reg;
> >  71+       }
> > Shouldn't we handle bcst_mem_operand instead of
> > memory_operand(bcst_memory_operand is not a memory_operand)?
> > so maybe
> > if (memory_operand (op0, mode0) || bcst_mem_operand (op0, mode0)
> >   if (memory_operand (op1, mode) || bcst_mem_operand (op1, mode0)?
> >  72       else
> >  73        std::swap (op0, op1);
> >  74     }
> >
> > Also there's force_reg in below 3 cases, are there any restrictions to avoid
> > bcst_mem_operand into them?
> > case 0x0f:  /* ~a */
> > case 0x33:  /* ~b */
> > case 0x33:  /* ~b */
> > ..
> >          if (!TARGET_64BIT && !register_operand (op2, mode))
> >            op2 = force_reg (mode, op2);
> >
> > --
> > BR,
> > Hongtao



-- 
BR,
Hongtao

Reply via email to