Hello Richard: On 03/06/24 7:47 pm, Richard Sandiford wrote: > Ajit Agarwal <aagar...@linux.ibm.com> writes: >> On 03/06/24 5:03 pm, Richard Sandiford wrote: >>> Ajit Agarwal <aagar...@linux.ibm.com> writes: >>>>> [...] >>>>> If it is intentional, what distinguishes things like vperm and xxinsertw >>>>> (and all other unspecs) from plain addition? >>>>> >>>>> [(set (match_operand:VSX_F 0 "vsx_register_operand" "=wa") >>>>> (plus:VSX_F (match_operand:VSX_F 1 "vsx_register_operand" "wa") >>>>> (match_operand:VSX_F 2 "vsx_register_operand" "wa")))] >>>>> >>>> >>>> Plain addition are not supported currently. >>>> We have not seen many cases with plain addition and this patch >>>> will not accept plain addition. >>>> >>>> >>>>> This is why the intention behind the patch is important. As it stands, >>>>> it isn't clear what criteria the patch is using to distinguish "valid" >>>>> fuse candidates from "invalid" ones. >>>>> >>>> >>>> Intention behind this patch all variants of UNSPEC instructions are >>>> supported and uses without UNSPEC are not supported in this patch. >>> >>> But why make the distinction this way though? UNSPEC is a very >>> GCC-specific concept. Whether something is an UNSPEC or some other >>> RTL code depends largely on historical accident. E.g. we have specific >>> codes for VEC_SELECT, VEC_MERGE, and VEC_DUPLICATE, but don't have one >>> for VEC_PERM (even for VEC_PERM_EXPR exists in gimple). >>> >>> It seems unlikely that GCC's choice about whether to represent something >>> as an UNSPEC or as another RTL code lines up neatly with the kind of >>> codegen decisions that a good assembly programmer would make. >>> >>> I suppose another way of asking is to turn this around and say: what >>> kind of uses are you trying to exclude? Presumably things are worse >>> if you remove this function override. But what makes them worse? >>> What kind of uses cause the regression? >>> >> >> Uses of fused load where load with low address uses are modified with load >> with high address uses. >> >> Similarly load with high address uses are modified with load low address >> uses. > > It sounds like something is going wrong the subreg updates. > Can you give an example of where this occurs? For instance... > >> This is the semantics of lxvp instructions which can occur through >> UNSPEC uses otherwise it breaks the functionality and seen failure >> in almost all vect regressions and SPEC benchmarks. > > ...could you take one of the simpler vect regressions, show the before > and after RTL, and why the transformation is wrong? >
Before the change: (insn 32 30 103 5 (set (reg:V16QI 127 [ _32 ]) (mem:V16QI (reg:DI 130 [ ivtmp.37 ]) [1 MEM <vector(8) short unsigned int> [(short unsigned int *)_55]+0 S16 A128])) {vsx_movv16qi_64bit} (nil)) (insn 103 32 135 5 (set (reg:V16QI 173 [ _32 ]) (mem:V16QI (plus:DI (reg:DI 130 [ ivtmp.37 ]) (const_int 16 [0x10])) [1 MEM <vector(8) short unsigned int> [(short unsigned int *)_55]+0 S16 A128])) {vsx_movv16qi_64bit} (nil)) (insn 135 103 34 5 (set (reg:DI 155) (plus:DI (reg:DI 130 [ ivtmp.37 ]) (const_int 16 [0x10]))) 66 {*adddi3} (nil)) (insn 34 135 104 5 (set (reg:V16QI 143 [ _27 ]) (unspec:V16QI [ (reg:V16QI 127 [ _32 ]) repeated x2 (reg:V16QI 152) ] UNSPEC_VPERM)) {altivec_vperm_v16qi_direct} (expr_list:REG_DEAD (reg:V16QI 127 [ _32 ]) (nil))) (insn 104 34 35 5 (set (reg:V16QI 174 [ _27 ]) (unspec:V16QI [ (reg:V16QI 173 [ _32 ]) repeated x2 (reg:V16QI 152) ] UNSPEC_VPERM)) {altivec_vperm_v16qi_direct} After the change: (insn 103 30 135 5 (set (reg:OO 127 [ _32 ]) (mem:OO (reg:DI 130 [ ivtmp.37 ]) [1 MEM <vector(8) short unsigned int> [(short unsigned int *)_55]+0 S16 A128])) {*movoo} (nil)) (insn 135 103 34 5 (set (reg:DI 155) (plus:DI (reg:DI 130 [ ivtmp.37 ]) (const_int 16 [0x10]))) 66 {*adddi3} (nil)) (insn 34 135 104 5 (set (reg:V16QI 143 [ _27 ]) (unspec:V16QI [ (subreg:V16QI (reg:OO 127 [ _32 ]) 16) (subreg:V16QI (reg:OO 127 [ _32 ]) 16) (reg:V16QI 152) ] UNSPEC_VPERM)) {altivec_vperm_v16qi_direct} (expr_list:REG_DEAD (reg:OO 127 [ _32 ]) (nil))) (insn 104 34 35 5 (set (reg:V16QI 174 [ _27 ]) (unspec:V16QI [ (subreg:V16QI (reg:OO 127 [ _32 ]) 0) (subreg:V16QI (reg:OO 127 [ _32 ]) 0) (reg:V16QI 152) ] UNSPEC_VPERM)) {altivec_vperm_v16qi_direct} After the change the tests passes. > Thanks, > Richard Thanks & Regards Ajit Thanks & Regards Ajit