>> Right now I don't see a need for this patch.

No, we need this patch.
With this patch, the following case can be combined into vfwmul.vv:

#define TEST_TYPE(TYPE1, TYPE2)                                              \
  __attribute__ ((noipa)) void vwadd_##TYPE1_##TYPE2 (                       \
    TYPE1 *__restrict dst, TYPE1 *__restrict dst2, TYPE1 *__restrict dst3,   \
    TYPE1 *__restrict dst4, TYPE2 *__restrict a, TYPE2 *__restrict b,        \
    TYPE2 *__restrict a2, TYPE2 *__restrict b2, int n)                       \
  {                                                                          \
    for (int i = 0; i < n; i++)                                              \
      {                                                                      \
        dst[i] = (TYPE1) a[i] * (TYPE1) b[i];                                \
        dst2[i] = (TYPE1) a2[i] * (TYPE1) b[i];                              \
        dst3[i] = (TYPE1) a2[i] * (TYPE1) a[i];                              \
        dst4[i] = (TYPE1) a[i] * (TYPE1) b2[i];                              \
      }                                                                      \
  }

TEST_TYPE (double, float)

You should try this, then you will know what I am saying.

juzhe.zh...@rivai.ai

From: Jeff Law
Date: 2023-06-30 06:59
To: 钟居哲; gcc-patches
CC: kito.cheng; kito.cheng; palmer; palmer; rdapp.gcc
Subject: Re: [PATCH] RISC-V: Support vfwmul.vv combine lowering

On 6/28/23 16:00, 钟居哲 wrote:
> You can see here:
>
> https://godbolt.org/z/d78646hWb
>
> The first case can't generate vfwmul.vv but the second case succeeds.
>
> Failed to match this instruction:
> (set (reg:VNx2DF 150 [ vect__11.50 ])
>     (if_then_else:VNx2DF (unspec:VNx2BI [
>                 (const_vector:VNx2BI repeat [
>                         (const_int 1 [0x1])
>                     ])
>                 (reg:DI 153)
>                 (const_int 2 [0x2]) repeated x2
>                 (const_int 1 [0x1])
>                 (const_int 7 [0x7])
>                 (reg:SI 66 vl)
>                 (reg:SI 67 vtype)
>                 (reg:SI 69 N/A)
>             ] UNSPEC_VPREDICATE)
>         (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 149 [ vect__5.45 ]))
>             (reg:VNx2DF 148 [ vect__8.49 ]))
>         (unspec:VNx2DF [
>                 (reg:SI 0 zero)
>             ] UNSPEC_VUNDEF)))

Right.  We try combining:

  24 -> 27
  25 -> 27
  23, 24 -> 27
  22, 25 -> 27

All of which fail, as expected.  24 -> 27 and 25 -> 27 only put an
extension on one operand of the mult.  The other two try to substitute
a float extend of an if-then-else, which I fully expect to fail.  All as
expected.

The next one that gets tried is:

> Trying 25, 24 -> 27:
>    25: r149:VNx2DF=float_extend(r141:VNx2SF)
>       REG_DEAD r141:VNx2SF
>    24: r148:VNx2DF=float_extend(r139:VNx2SF)
>       REG_DEAD r139:VNx2SF
>    27: r150:VNx2DF={(unspec[const_vector,r153:DI,0x2,0x2,0x1,0x7,vl:SI,vtype:SI,N/A:SI] 69)?r148:VNx2DF*r149:VNx2DF:unspec[zero:SI] 68}
>       REG_DEAD r149:VNx2DF
>       REG_DEAD r148:VNx2DF
>       REG_DEAD N/A:SI
>       REG_DEAD zero:SI
>       REG_EQUAL r148:VNx2DF*r149:VNx2DF
> Successfully matched this instruction:
> (set (reg:VNx2DF 150 [ vect__11.50 ])
>     (if_then_else:VNx2DF (unspec:VNx2BI [
>                 (const_vector:VNx2BI repeat [
>                         (const_int 1 [0x1])
>                     ])
>                 (reg:DI 153)
>                 (const_int 2 [0x2]) repeated x2
>                 (const_int 1 [0x1])
>                 (const_int 7 [0x7])
>                 (reg:SI 66 vl)
>                 (reg:SI 67 vtype)
>                 (reg:SI 69 N/A)
>             ] UNSPEC_VPREDICATE)
>         (mult:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 141 [ vect__4.44 ]))
>             (float_extend:VNx2DF (reg:VNx2SF 139 [ vect__7.48 ])))
>         (unspec:VNx2DF [
>                 (reg:SI 0 zero)
>             ] UNSPEC_VUNDEF)))
> allowing combination of insns 24, 25 and 27
> original costs 4 + 4 + 4 = 12
> replacement cost 4

Note how it replaced both operands of the mult with extended versions
and the pattern matches, as expected.

The point being that I don't think those helper patterns are needed to
handle the problem you suggested they were there to handle.  Combine
knows how to handle multiple substitutions just fine.

Right now I don't see a need for this patch.

Jeff
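
For anyone who wants to reproduce the discussion locally, here is a minimal sketch of the TEST_TYPE (double, float) loop written out by hand, with a small self-check.  The names widen_mul_df_sf, count_mismatches and NOIPA are hypothetical and do not come from the patch or the testsuite.  The point of this shape is that each extended float operand feeds more than one multiply, so it is a reasonable loop to inspect in the combine dump.

```c
/* noipa is a GCC attribute; this guard is an assumption made so the
   sketch also builds with other compilers.  */
#if defined(__GNUC__) && !defined(__clang__)
#define NOIPA __attribute__ ((noipa))
#else
#define NOIPA
#endif

/* Hypothetical name: the TEST_TYPE (double, float) loop, expanded by hand.
   a[] and b[] are each extended and used in several multiplies.  */
NOIPA void
widen_mul_df_sf (double *__restrict dst, double *__restrict dst2,
                 double *__restrict dst3, double *__restrict dst4,
                 float *__restrict a, float *__restrict b,
                 float *__restrict a2, float *__restrict b2, int n)
{
  for (int i = 0; i < n; i++)
    {
      dst[i] = (double) a[i] * (double) b[i];
      dst2[i] = (double) a2[i] * (double) b[i];
      dst3[i] = (double) a2[i] * (double) a[i];
      dst4[i] = (double) a[i] * (double) b2[i];
    }
}

/* Hypothetical helper: run the loop on small inputs and return how many
   results differ from a scalar reference.  The product of two floats is
   exactly representable in double, so exact comparison is safe here.  */
int
count_mismatches (void)
{
  enum { N = 37 };
  float a[N], b[N], a2[N], b2[N];
  double d[N], d2[N], d3[N], d4[N];
  int bad = 0;

  for (int i = 0; i < N; i++)
    {
      a[i] = (float) (i + 1);
      b[i] = (float) (2 * i - 5);
      a2[i] = (float) (3 - i);
      b2[i] = 0.5f * (float) i;
    }

  widen_mul_df_sf (d, d2, d3, d4, a, b, a2, b2, N);

  for (int i = 0; i < N; i++)
    {
      bad += d[i] != (double) a[i] * (double) b[i];
      bad += d2[i] != (double) a2[i] * (double) b[i];
      bad += d3[i] != (double) a2[i] * (double) a[i];
      bad += d4[i] != (double) a[i] * (double) b2[i];
    }
  return bad;
}
```

count_mismatches () should return 0 on any target.  Whether the multiplies end up as vfwmul.vv depends on the compiler version and on vectorization being enabled for an RVV target, so check the generated assembly or the combine dump rather than assuming either outcome.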