https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123343
--- Comment #11 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to [email protected] from comment #7)
> On Wed, 7 Jan 2026, tnfchris at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123343
> >
> > --- Comment #6 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
> > (In reply to Richard Biener from comment #5)
> > > (In reply to Zhongyao Chen from comment #4)
> > > > Yes. Without unrolling, inner loop vectorization produces better asm.
> > > > Could .cunrolli be made vectorization-aware to avoid unrolling when
> > > > beneficial?
> > > > Once unrolled, SLP has no way to recover the loop; what we want is
> > > > inner loop vectorization only, not SLP.
> > >
> > > cunrolli first and foremost job is to remove abstraction, it's difficult
> > > to anticipate further optimization on the unrolled body, so - not easily
> > > I'd say.
> > >
> > > BB SLP should work on this though (but as you said we first vectorize the
> > > loop containing the code in an awkward way).
> >
> > Part of the reason I'm working on PR119187 is to hopefully be able to
> > recover such cases like this where the pass ordering makes things awkward,
> > so you'll end up with an SLP tree which contains both vector and scalar
> > statements.
> >
> > But in this case I do think the dataref analysis could be improved to help?
>
> My understanding is that SLP (and thus unrolling) is generally difficult
> with VLA vector ISAs and re-rolling while useful only works when all
> statements of a loop are unrolled.  But in this case only part of the
> loop is.
>

It is, but I think what I was getting at is why do we treat this loop and
others with fixed, known iteration count as VLA. i.e. this loop only processes
128-bits per iteration, and speccpu aside, we've had many cases where
-msve-vector-bits=128 generated better code than -msve-vector-bits=scalable for
these cases.
I was actually looking into Alfie's capped VF patch for analyzing a particular
version during costing assuming a fixed VL. As long as that fixed VL is smaller
than or equal to the ISA's minimum VL, it can be beneficial.
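
For illustration only, here is a minimal sketch of the kind of loop shape being discussed: an inner loop with a fixed, known trip count that touches exactly 128 bits per outer iteration. This is not the testcase from the PR; the function name, the `LANES` constant and the arithmetic are hypothetical and chosen just to show the shape.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel, NOT the testcase from the PR.  The inner loop has a
   fixed, compile-time trip count of 4 floats, i.e. 128 bits of data per
   outer iteration.  */
enum { LANES = 4 };

void
scale_add_rows (float *restrict dst, const float *restrict src,
                float k, size_t rows)
{
  assert (dst != NULL && src != NULL);

  for (size_t i = 0; i < rows; i++)
    /* Fixed-count inner loop: a candidate for complete unrolling, and
       also for inner-loop vectorization with a 128-bit vector.  */
    for (size_t j = 0; j < LANES; j++)
      dst[i * LANES + j] += k * src[i * LANES + j];
}
```

One way to compare the two code-generation modes mentioned above would be to build such a file for an SVE target twice, once with -msve-vector-bits=128 and once with -msve-vector-bits=scalable, and diff the assembly. Which one comes out better for this particular sketch is not something established here.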
