This patch series aims to fix the consistency of the CFG profile for vectorized early break loops. I.e., if the CFG profile entering the vectorizer is consistent for a given early break loop, this series aims to ensure that the final vectorized loop also has a consistent profile.
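
To illustrate what "consistent" means here, the following is a minimal toy sketch, not GCC code. It assumes a simplified model in which every block and edge carries a plain integer count. The names ToyEdge, ToyCfg, toy_profile_consistent and make_toy_early_break_loop are hypothetical and exist only for this example; GCC's real profile_count and edge probabilities are considerably richer. In this model a profile is consistent when each block's count equals the sum of its incoming edge counts and the sum of its outgoing edge counts.

```cpp
#include <cstdint>
#include <vector>

// Toy model only: plain integer counts stand in for GCC's profile_count.
struct ToyEdge {
  int src;
  int dest;
  std::int64_t count;
};

struct ToyCfg {
  std::vector<std::int64_t> block_count;  // indexed by block id
  std::vector<ToyEdge> edges;
};

// Return true if, for every block, the block count matches the sum of its
// incoming edge counts (when it has predecessors) and the sum of its
// outgoing edge counts (when it has successors).
bool toy_profile_consistent(const ToyCfg &cfg)
{
  for (int b = 0; b < static_cast<int>(cfg.block_count.size()); ++b) {
    std::int64_t in = 0, out = 0;
    bool has_pred = false, has_succ = false;
    for (const ToyEdge &e : cfg.edges) {
      if (e.dest == b) { in += e.count; has_pred = true; }
      if (e.src == b) { out += e.count; has_succ = true; }
    }
    if (has_pred && in != cfg.block_count[b])
      return false;
    if (has_succ && out != cfg.block_count[b])
      return false;
  }
  return true;
}

// Build a toy early break loop with made-up counts.
// Blocks: 0 = preheader, 1 = loop body, 2 = early break exit, 3 = main exit.
// The count of the early break exit block is a parameter so that a missing
// (zero) count can be modelled.
ToyCfg make_toy_early_break_loop(std::int64_t early_exit_block_count)
{
  ToyCfg cfg;
  cfg.block_count = {100, 1000, early_exit_block_count, 70};
  cfg.edges = {
    {0, 1, 100},  // loop entry
    {1, 1, 900},  // back edge
    {1, 2, 30},   // early break exit
    {1, 3, 70},   // main exit
  };
  return cfg;
}
```

With these made-up numbers, make_toy_early_break_loop(30) passes the check, while make_toy_early_break_loop(0) fails it because the early break exit block's count no longer matches its incoming edge. That is the kind of mismatch an exit block with an unset count produces in this toy model.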

For SPEC CPU 2017 on Neoverse V1, the series shows a 4.8% improvement in
imagick with no significant regressions elsewhere.

The series is structured as follows:
 - 1/4 adds missing code to set the counts of the exit blocks when
   creating the CFG for a vectorized early break loop.
 - 2/4 does most of the heavy lifting, adding new code to cfgloopmanip
   to correctly scale the profile of a loop with multiple exits,
   adjusting scale_loop_profile to use it and exposing new helpers in
   cfgloopmanip.h used by subsequent vect patches.
 - 3/4 makes various fixes to ensure adding an epilog skip edge
   preserves consistency of the profile.
 - 4/4 adjusts scale_profile_for_vect_loop to make it correctly handle
   loops with multiple exits.

Many thanks to Andrew Carlotti who gave me significant help in debugging
and discussing the ideas used in this patch series.

Bootstrapped/regtested on aarch64-linux-gnu, arm-linux-gnueabihf, and
x86_64-linux-gnu.

Alex Coplan (4):
  vect: Set counts of early break exit blocks correctly [PR117790]
  cfgloopmanip: Add infrastructure for scaling of multi-exit loops [PR117790]
  vect: Ensure profile consistency when adding epilog guard [PR117790]
  vect: Fix scale_profile_for_vect_loop for multiple exits [PR117790]

 gcc/cfgloopmanip.cc                           | 309 ++++++++++++++++--
 gcc/cfgloopmanip.h                            |   7 +
 .../gcc.dg/vect/vect-early-break-profile-1.c  |  10 +
 .../gcc.dg/vect/vect-early-break-profile-2.c  |  21 ++
 gcc/tree-vect-loop-manip.cc                   |  58 +++-
 gcc/tree-vect-loop.cc                         |  21 +-
 6 files changed, 378 insertions(+), 48 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-early-break-profile-1.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-early-break-profile-2.c