Hi Jeff,

> The pattern's operand 0 explicitly allows MEMs as do the constraints.
> So forcing the operand into a register just seems like it's papering
> over the real problem.


The added of force_reg code is address the problem preduced after address the 
error combine.
The more restrict condtion of the pattern forbidden mem->mem pattern which 
will
produced in -O0. I think the implementation forgot to do this force_reg 
operation before
when doing the intrinis expansion The reason this problem isn't exposed before 
is because
the reload pass will converts mem->mem to mem->reg; reg->mem based on 
the constraint.


> I wonder if we should just remove the memory destination from this
> pattern.  Ultimately isn't that case just trying to optimize a 
constant
> store into memory -- perhaps we just need a distinct pattern for that.
> We generally try to avoid that for movXX patterns, but this seems a bit
> different.


The pattern like scalar mov pattern, need to block mem->mem case.
I think mem->reg, reg->mem, reg->reg patterns are defined in the
same insn is more readable, I wonder how you feel about that?
And there's another `*mov<mode&gt;_whole` pattern that needs to be
restricted here as well, I'll try to send a separate patch to address that
like bellow.


(define_insn "*mov<mode&gt;_whole"
&nbsp; [(set (match_operand:V_WHOLE 0 "reg_or_mem_operand" "=vr, m,vr")
        (match_operand:V_WHOLE 1 "reg_or_mem_operand" "&nbsp; m,vr,vr"))]
&nbsp; "TARGET_VECTOR"
&nbsp; ...)


Change to:


(define_insn "*mov<mode&gt;_whole"
&nbsp; [(set (match_operand:V_WHOLE 0 "reg_or_mem_operand" "=vr, m,vr")
        (match_operand:V_WHOLE 1 "reg_or_mem_operand" "&nbsp; m,vr,vr"))]
&nbsp; "TARGET_VECTOR &amp;&amp; (register_operand (operands[0], <MODE&gt;mode)
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;|| 
register_operand(operands[1], <MODE&gt;mode))"
&nbsp; ...)



&gt; This comment doesn't make sense in conjuction with your earlier details.
&gt; In particular combine doesn't run at -O0, so your earlier comment that
&gt; combine creates the problem seems inconsistent with the comment above.

As the above says, the code addresses the problem which produced
after addressing the combine problem.


&gt; Umm, wow.&nbsp; I haven't thought deeply about this, but the complexity of
&gt; that insn condition is a huge red flag that our operand predicates
&gt; aren't correct for this pattern.


This condition is large because the vsetvl info need (compare to scalar mov or 
*mov<mode&gt;_whole pattern),
but I think this condition is enough clear to understand. Let me explain 
briefly.


&nbsp; &nbsp; (register_operand (operands[0], <MODE&gt;mode) &amp;&amp; MEM_P 
(operands[3]))
&nbsp; &nbsp; || (MEM_P (operands[0]) &amp;&amp; register_operand(operands[3], 
<MODE&gt;mode))


This two conditons mean allow mem-&gt;reg and reg-&gt;mem pattern.


&nbsp; &nbsp; (register_operand (operands[0], <MODE&gt;mode) &amp;&amp; 
satisfies_constraint_Wc1 (operands[1]))


This condition mean the mask must be all trues for reg-&gt;reg_or_imm pattern 
since
reg-&gt;reg insn doen't support mask operand.


Best,
Lehua


------------------&nbsp;Original&nbsp;------------------
From:                                                                           
                                             "Jeff Law"                         
                                                           
<jeffreya...@gmail.com&gt;;
Date:&nbsp;Wed, Aug 9, 2023 00:10 AM
To:&nbsp;"Lehua 
Ding"<lehua.d...@rivai.ai&gt;;"gcc-patches"<gcc-patches@gcc.gnu.org&gt;;
Cc:&nbsp;"juzhe.zhong"<juzhe.zh...@rivai.ai&gt;;"rdapp.gcc"<rdapp....@gmail.com&gt;;"kito.cheng"<kito.ch...@gmail.com&gt;;"palmer"<pal...@rivosinc.com&gt;;
Subject:&nbsp;Re: [PATCH] RISC-V: Fix error combine of pred_mov pattern



On 8/8/23 05:57, Lehua Ding wrote:
&gt; Hi,
&gt; 
&gt; This patch fix PR110943 which will produce some error code. This is because
&gt; the error combine of some pred_mov pattern. Consider this code:
&gt; 
&gt; ```
&gt; #include <riscv_vector.h&gt;
&gt; 
&gt; void foo9 (void *base, void *out, size_t vl)
&gt; {
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; int64_t scalar = *(int64_t*)(base + 100);
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; vint64m2_t v = __riscv_vmv_v_x_i64m2 (0, 1);
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; *(vint64m2_t*)out = v;
&gt; }
&gt; ```
&gt; 
&gt; RTL before combine pass:
&gt; 
&gt; ```
&gt; (insn 11 10 12 2 (set (reg/v:RVVM2DI 134 [ v ])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
(if_then_else:RVVM2DI (unspec:RVVMF32BI [
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (const_vector:RVVMF32BI repeat [
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (const_int 1 [0x1])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 ])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (const_int 1 [0x1])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (const_int 2 [0x2]) repeated x2
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (const_int 0 [0])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (reg:SI 66 vl)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (reg:SI 67 vtype)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 ] UNSPEC_VPREDICATE)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (const_vector:RVVM2DI repeat [
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (const_int 0 [0])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 ])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (unspec:RVVM2DI [
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (reg:SI 0 zero)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 ] UNSPEC_VUNDEF))) "/app/example.c":6:20 1089 {pred_movrvvm2di})
&gt; (insn 14 13 0 2 (set (mem:RVVM2DI (reg/v/f:DI 136 [ out ]) [1 
MEM[(vint64m2_t *)out_4(D)]+0 S[32, 32] A128])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (reg/v:RVVM2DI 134 [ 
v ])) "/app/example.c":7:23 717 {*movrvvm2di_whole})
&gt; ```
&gt; 
&gt; RTL after combine pass:
&gt; ```
&gt; (insn 14 13 0 2 (set (mem:RVVM2DI (reg:DI 138) [1 MEM[(vint64m2_t 
*)out_4(D)]+0 S[32, 32] A128])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
(if_then_else:RVVM2DI (unspec:RVVMF32BI [
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (const_vector:RVVMF32BI repeat [
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (const_int 1 [0x1])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 ])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (const_int 1 [0x1])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (const_int 2 [0x2]) repeated x2
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (const_int 0 [0])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (reg:SI 66 vl)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (reg:SI 67 vtype)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 ] UNSPEC_VPREDICATE)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (const_vector:RVVM2DI repeat [
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (const_int 0 [0])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 ])
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (unspec:RVVM2DI [
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (reg:SI 0 zero)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 ] UNSPEC_VUNDEF))) "/app/example.c":7:23 1089 {pred_movrvvm2di})
&gt; ```
&gt; 
&gt; This combine change the semantics of insn 14. I refine the conditon of 
@pred_mov
&gt; pattern to a more restrict. It's Ok for trunk?
&gt; 
&gt; Best,
&gt; Lehua
&gt; 
&gt; 
&gt;PR target/110943
&gt; 
&gt; gcc/ChangeLog:
&gt; 
&gt;* config/riscv/riscv-vector-builtins.cc 
(function_expander::function_expander):
&gt;&nbsp; force_reg mem operand.
&gt;* config/riscv/vector.md: Refine condition.
&gt; 
&gt; gcc/testsuite/ChangeLog:
&gt; 
&gt;* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Update.
&gt;* gcc.target/riscv/rvv/base/pr110943.c: New test.
So at a high level this doesn't look correct to me.

The pattern's operand 0 explicitly allows MEMs as do the constraints. 
So forcing the operand into a register just seems like it's papering 
over the real problem.

I wonder if we should just remove the memory destination from this 
pattern.&nbsp; Ultimately isn't that case just trying to optimize a constant 
store into memory -- perhaps we just need a distinct pattern for that. 
We generally try to avoid that for movXX patterns, but this seems a bit 
different.


&gt;&nbsp;&nbsp; create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/pr110943.c
&gt; 
&gt; diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
&gt; index 528dca7ae85..cd40fb2060f 100644
&gt; --- a/gcc/config/riscv/riscv-vector-builtins.cc
&gt; +++ b/gcc/config/riscv/riscv-vector-builtins.cc
&gt; @@ -3471,7 +3471,13 @@ function_expander::function_expander (const 
function_instance &amp;instance,
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; exp (exp_in), target (target_in), opno 
(0)
&gt;&nbsp;&nbsp; {
&gt;&nbsp;&nbsp;&nbsp;&nbsp; if (!function_returns_void_p ())
&gt; -&nbsp;&nbsp;&nbsp; create_output_operand (&amp;m_ops[opno++], target, 
TYPE_MODE (TREE_TYPE (exp)));
&gt; +&nbsp;&nbsp;&nbsp; {
&gt; +&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (target != NULL_RTX &amp;&amp; MEM_P 
(target))
&gt; + /* Use force_reg to prevent illegal mem-to-mem pattern on -O0.&nbsp; */
This comment doesn't make sense in conjuction with your earlier details. 
 In particular combine doesn't run at -O0, so your earlier comment that 
combine creates the problem seems inconsistent with the comment above.


&gt; diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
&gt; index e56a2bf4bed..f0484b1162c 100644
&gt; --- a/gcc/config/riscv/vector.md
&gt; +++ b/gcc/config/riscv/vector.md
&gt; @@ -1509,8 +1509,9 @@
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (reg:SI 
VTYPE_REGNUM)] UNSPEC_VPREDICATE)
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (match_operand:V_VLS 3 
"vector_move_operand"&nbsp;&nbsp; "&nbsp;&nbsp;&nbsp; 
m,&nbsp;&nbsp;&nbsp;&nbsp; m,&nbsp;&nbsp;&nbsp;&nbsp; m,&nbsp;&nbsp;&nbsp; 
vr,&nbsp;&nbsp;&nbsp; vr,&nbsp;&nbsp;&nbsp; vr, viWc0, viWc0")
&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (match_operand:V_VLS 2 
"vector_merge_operand"&nbsp; "&nbsp;&nbsp;&nbsp; 0,&nbsp;&nbsp;&nbsp; 
vu,&nbsp;&nbsp;&nbsp; vu,&nbsp;&nbsp;&nbsp; vu,&nbsp;&nbsp;&nbsp; 
vu,&nbsp;&nbsp;&nbsp;&nbsp; 0,&nbsp;&nbsp;&nbsp; vu,&nbsp;&nbsp;&nbsp;&nbsp; 
0")))]
&gt; -&nbsp; "TARGET_VECTOR &amp;&amp; (MEM_P (operands[0]) || MEM_P 
(operands[3])
&gt; -&nbsp;&nbsp; || CONST_VECTOR_P (operands[1]))"
&gt; +&nbsp; "TARGET_VECTOR &amp;&amp; ((register_operand (operands[0], 
<MODE&gt;mode) &amp;&amp; MEM_P (operands[3])) ||
&gt; 
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (MEM_P (operands[0]) &amp;&amp; register_operand (operands[3], <MODE&gt;mode)) 
||
&gt; 
+&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
 (register_operand (operands[0], <MODE&gt;mode) &amp;&amp; 
satisfies_constraint_Wc1 (operands[1])))"
Umm, wow.&nbsp; I haven't thought deeply about this, but the complexity of 
that insn condition is a huge red flag that our operand predicates 
aren't correct for this pattern.

From a formatting standpoint bring the wrapped operator down and 
indent.&nbsp; ie

 (condition 1
 || condition 2
 || (condition 3
 &amp;&amp; other test 4))


Jeff

Reply via email to