https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117974
--- Comment #7 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
(In reply to Vineet Gupta from comment #4)
> (In reply to JuzheZhong from comment #2)
> > We need to split all insns since some of them are not the ultimate RVV
> > instruction pattern that depend on VL/VTYPE.
>
> > And I don't think the vsetvli should be keep close VLE,
>
> Because ?
>
> > instead, They are
> > redundant, I think the problem is VSETVLI PASS fail to eliminate them since
> > look into the RTL IR before VSETVLI PASS:
>
> Are you really sure they are redundant ? It seems the unrolled loop iterator
> is being manipulated via sub and then fed as input to VSETVL, whose output
> then feeds to next iteration. So there does seem to be data dependency and
> it might not be redundant after all.
>
> >
> > https://godbolt.org/z/WdrjoYj49
> >
> > (reg:DI 15 a5 [orig:139 _31 ] [139])
>
> >     vsetvli   a5,a1,e32,m1,tu,ma
> >     vle32.v   v2,0(a0)
> >     sub       a1,a1,a5             <-- input to next vsetvl
> >     sh2add    a0,a5,a0
> >     vfmacc.vv v1,v2,v2
> >     vsetvli   a5,a1,e32,m1,tu,ma   <-- vsetvl output
> >     beq       a1,zero,.L12
> >     vle32.v   v2,0(a0)
> >     sub       a1,a1,a5             <-- used here
> >     sh2add    a0,a5,a0
> ....

Oh. I see. I missed the dependency here.

I think Phase 3 early fusion should handle this scenario.
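
To make the dependency concrete, here is a rough Python model of the quoted loop. It is only a sketch: `vsetvl` and `strip_mine` are made-up names, `vlmax=4` is an arbitrary value, and vl is modelled as min(AVL, VLMAX), which is a simplification of what the hardware is allowed to return. It only shows that each vsetvli reads the a1 value written by the preceding sub.

```python
def vsetvl(avl, vlmax):
    """Simplified model of vsetvli: vl = min(avl, vlmax) (an assumption)."""
    return min(avl, vlmax)


def strip_mine(n, vlmax=4):
    """Return the (avl, vl) pair seen by each loop body iteration."""
    trace = []
    avl = n                      # a1 in the listing
    vl = vsetvl(avl, vlmax)      # vsetvli a5,a1,e32,m1,tu,ma
    while avl != 0:              # beq a1,zero,.L12
        trace.append((avl, vl))  # vle32.v / vfmacc.vv run with this vl
        avl -= vl                # sub a1,a1,a5   (input to next vsetvl)
        vl = vsetvl(avl, vlmax)  # vsetvli a5,a1,... (depends on the sub)
    return trace


if __name__ == "__main__":
    for avl, vl in strip_mine(10):
        print(f"avl={avl} vl={vl}")
```

With n=10 the last iteration gets a smaller vl than the first two, so in this model the second vsetvli cannot simply be dropped as a duplicate of the first.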