https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112438
--- Comment #4 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
Oh, I see what you mean. I don't think it is a valid optimization.

Consider the following code:

.L3:
        vsetvli a5,a0,e32,m1,ta,ma
        slli    a4,a5,2
        vle32.v v1,0(a1)
        sub     a0,a0,a5
        vadd.vv v1,v1,v2
        vse32.v v1,0(a2)
        add     a1,a1,a4
        vsetvli a5,zero,e32,m1,ta,ma    ---> seems redundant
        add     a2,a2,a4
        vadd.vv v2,v2,v4
        bne     a0,zero,.L3

Suppose VLEN = 8 elements and a0 is 13 going into the last 2 iterations. If we remove the VLMAX vsetvl that seems redundant, we may have issues on some hardware.

With 13 elements left, the hardware can choose to process 6 elements in the second-to-last iteration and 7 elements in the last iteration. The result of the VLMAX vadd.vv is used by the next iteration, NOT the current one. Without the VLMAX vsetvl, that vadd.vv would produce only 6 elements for the last iteration, which needs 7. That causes a bug.

So, it is not a valid optimization...
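
To make the scenario concrete, here is a rough Python sketch of the loop's v2 update only. It is a toy model, not GCC's vsetvl pass or real hardware: the initial contents of v2 and v4 are made up, the helper names are hypothetical, and the 6-then-7 split is simply the one assumed in this comment.

```python
VLMAX = 8  # elements per vector register, as assumed in this comment


def vadd(dst, a, b, vl):
    """Toy model of vadd.vv: only the first vl lanes receive a + b.

    Lanes >= vl keep their old value here. Tail-agnostic hardware may
    instead fill them with all 1s; either way they do not hold the sum.
    """
    return [a[i] + b[i] if i < vl else dst[i] for i in range(VLMAX)]


def run(vls, keep_vlmax_vsetvl):
    """Return the lanes of v2 that each iteration's `vadd.vv v1,v1,v2` reads."""
    v2 = list(range(VLMAX))   # hypothetical initial contents of v2
    v4 = [VLMAX] * VLMAX      # hypothetical loop-invariant step held in v4
    seen = []
    for vl in vls:
        seen.append(v2[:vl])                       # vadd.vv v1,v1,v2 under vl
        update_vl = VLMAX if keep_vlmax_vsetvl else vl
        v2 = vadd(v2, v2, v4, update_vl)           # vadd.vv v2,v2,v4
    return seen


with_vsetvl = run([6, 7], keep_vlmax_vsetvl=True)
without_vsetvl = run([6, 7], keep_vlmax_vsetvl=False)
print(with_vsetvl[1])     # lanes read by the last iteration, VLMAX vsetvl kept
print(without_vsetvl[1])  # same, with the VLMAX vsetvl removed
```

In this model the seventh lane read by the last iteration is stale once the VLMAX vsetvl is dropped, because the previous vadd.vv v2,v2,v4 only wrote 6 lanes.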