https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117974

--- Comment #7 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
(In reply to Vineet Gupta from comment #4)
> (In reply to JuzheZhong from comment #2)
> > We need to split all insns since some of them are not the ultimate RVV
> > instruction pattern that depend on VL/VTYPE.
> 
> > And I don't think the vsetvli should be kept close to the VLE;
> 
> 
> Because ?
> 
> > instead, they are
> > redundant. I think the problem is that the VSETVLI pass fails to eliminate
> > them. Look at the RTL IR before the VSETVLI pass:
> 
> Are you really sure they are redundant ? It seems the unrolled loop iterator
> is being manipulated via sub and then fed as input to VSETVL, whose output
> then feeds to next iteration. So there does seem to be data dependency and
> it might not be redundant after all.
> 
> > 
> > https://godbolt.org/z/WdrjoYj49
> > 
> > (reg:DI 15 a5 [orig:139 _31 ] [139])
> > 
> >         vsetvli a5,a1,e32,m1,tu,ma  
> >         vle32.v v2,0(a0)
> >         sub     a1,a1,a5             <-- input to next vsetvl
> >         sh2add  a0,a5,a0
> >         vfmacc.vv       v1,v2,v2
> >         vsetvli a5,a1,e32,m1,tu,ma   <-- vsetvl output 
> >         beq     a1,zero,.L12
> >         vle32.v v2,0(a0)
> >         sub     a1,a1,a5             <-- used here
> >         sh2add  a0,a5,a0
> ....

Oh, I see. I missed the dependency here. I think Phase 3 (early fusion) should
handle this scenario.
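
To make the dependency concrete, here is a minimal Python sketch of the
strip-mined loop quoted above. It is only an illustration, not GCC code: the
names vsetvli, trace_vl and VLMAX are hypothetical, VLMAX = 4 assumes e32/m1
with VLEN = 128, and vl = min(avl, VLMAX) is a simplified model (the RVV spec
allows other choices when VLMAX < AVL < 2*VLMAX).

```python
VLMAX = 4  # assumption: e32, m1, VLEN=128 -> 4 elements per vector register


def vsetvli(avl, vlmax=VLMAX):
    """Simplified model of vsetvli: the granted vl is min(avl, vlmax)."""
    return min(avl, vlmax)


def trace_vl(n):
    """Return the vl used by each iteration of the strip-mined loop."""
    a1 = n                 # remaining element count (AVL)
    vls = []
    a5 = vsetvli(a1)       # vsetvli a5,a1,e32,m1,tu,ma
    while a1 != 0:         # beq a1,zero,.L12
        vls.append(a5)     # vle32.v / vfmacc.vv operate on a5 elements
        a1 -= a5           # sub a1,a1,a5   <-- input to next vsetvli
        a5 = vsetvli(a1)   # result depends on the updated a1
    return vls


print(trace_vl(10))  # [4, 4, 2]
```

The last iteration receives a smaller vl than the earlier ones, so under this
model the second vsetvli cannot be dropped as a duplicate of the first: its
AVL operand has changed in between.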
