https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
--- Comment #24 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #23)
> Maybe easier to understand testcase:
>
> with -O3 -msse4.1 -fno-vect-cost-model we return 20 instead of 8. Adding
> -fdisable-tree-cunroll avoids the issue. The upper bound we set on the
> vector loop causes us to force taking the IV exit which continues
> with i == (niter - 1) / VF * VF, but 'niter' is 20 here.

Yes, indeed, that's what my patch was arguing last time, but I didn't explain
it well enough. I'm about to send out v2 (waiting for regtest to finish), which
hopefully articulates this better.
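
For illustration only, the index expression quoted above, i == (niter - 1) / VF * VF, can be evaluated with a few lines of Python. This is a sketch of the arithmetic, not the testcase from the bug and not GCC code: the function name `resume_index` is hypothetical, and the VF values are assumptions, since the comment does not state the vectorization factor. Only niter == 20 comes from the discussion.

```python
def resume_index(niter: int, vf: int) -> int:
    """Evaluate (niter - 1) / VF * VF with truncating integer division.

    The result is (niter - 1) rounded down to a multiple of vf, i.e. the
    scalar index the quoted comment says execution continues with after
    the vector loop is left through its IV exit.
    """
    if niter < 1 or vf < 1:
        raise ValueError("niter and vf must be positive")
    return (niter - 1) // vf * vf


if __name__ == "__main__":
    niter = 20  # value named in the comment
    for vf in (2, 4, 8):  # hypothetical VFs, not taken from the bug report
        print(f"VF={vf}: i resumes at {resume_index(niter, vf)}")
```

With niter == 20 the resume index is a large multiple of VF for any of these assumed factors, which only makes sense if the loop really runs all 20 iterations.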