https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #44 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Sandiford from comment #42)
> Created attachment 57605 [details]
> proof-of-concept patch to suppress peeling for gaps
>
> How about the attached? It records whether all accesses that require
> peeling for gaps could instead have used gathers, and only retries when
> that's true. It means that we retry for only 0.034% of calls to
> vect_analyze_loop_1 in a build of SPEC2017 with -mcpu=neoverse-v1 -Ofast
> -fomit-frame-pointer.

I guess this idea would work, but as said, a full re-analysis shouldn't be
required; instead "just" the updated cost of the affected loads/stores needs
to be recomputed? Of course this would require quite some implementation
work.

If we just want to fix this regression, the approach looks sensible, but it
would also be applied to x86, which doesn't want to compare costs, right?
I'm not sure the gather vs. permute costing there makes this a good idea for
stage4.
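
To make the two options above concrete, here is a toy sketch of the difference between "retry the whole analysis only when every peeling-for-gaps access could have used a gather" and "just recompute the cost of the affected accesses". It is not GCC code: `Access`, `should_retry`, `recost_affected` and all cost numbers are hypothetical names and values invented for illustration, and the real vectorizer cost model is far more involved.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Access:
    """Toy stand-in for one vectorized load/store (hypothetical, not a GCC type)."""
    name: str
    needs_peeling_for_gaps: bool  # chosen strategy forces peeling for gaps
    gather_possible: bool         # a gather could have been used instead
    permute_cost: int             # made-up cost of the current (permute) strategy
    gather_cost: int              # made-up cost of the gather alternative


def should_retry(accesses: List[Access]) -> bool:
    """Retry condition as described in the quoted text: retry only if some
    access needs peeling for gaps and ALL such accesses could use gathers."""
    gap = [a for a in accesses if a.needs_peeling_for_gaps]
    return bool(gap) and all(a.gather_possible for a in gap)


def recost_affected(accesses: List[Access], base_cost: int) -> int:
    """Alternative suggested in the reply: keep the existing analysis and only
    swap the cost of the affected accesses (assumes base_cost already includes
    their permute cost)."""
    delta = sum(
        a.gather_cost - a.permute_cost
        for a in accesses
        if a.needs_peeling_for_gaps and a.gather_possible
    )
    return base_cost + delta
```

In this toy model the second function touches only the accesses whose strategy would change, which is the point of the "no full re-analysis" suggestion; whether the resulting gather vs. permute comparison is trustworthy on a given target is a separate question.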