This adds missing code to correctly set the counts of the exit blocks we create when building the CFG for a vectorized early break loop.
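
To illustrate the arithmetic, here is a minimal standalone sketch.  It is
a toy model only: it uses plain integers rather than GCC's profile_count,
and the names exit_counts and split_exit_counts are hypothetical and do
not appear in the patch.  It assumes the main exit edge's count never
exceeds the preheader's count.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for the counts of the two new exit blocks.  GCC's real
// profile_count also tracks quality and uninitialized states, which
// this sketch deliberately ignores.
struct exit_counts
{
  uint64_t main_exit;  // models main_loop_exit_block->count
  uint64_t alt_exit;   // models alt_loop_exit_block->count
};

// Split PREHEADER_COUNT between the two exit blocks: the main exit
// block takes the count of the main exit edge, and whatever is left
// goes to the block that all alternative exits were redirected to.
exit_counts
split_exit_counts (uint64_t preheader_count, uint64_t main_exit_edge_count)
{
  // Assumption of this sketch: the profile is consistent, so the edge
  // count cannot exceed the count of the block it was split from.
  assert (main_exit_edge_count <= preheader_count);
  return { main_exit_edge_count, preheader_count - main_exit_edge_count };
}
```

For example, with a preheader count of 1000 and a main exit edge count of
900, this gives 900 for the main exit block and 100 for the alternative
exit block, so the two counts still sum to the preheader's count.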

Tested as a series on aarch64-linux-gnu, arm-linux-gnueabihf, and
x86_64-linux-gnu.  OK for trunk?

Thanks,
Alex

gcc/ChangeLog:

	PR tree-optimization/117790
	* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
	Set profile counts for {main,alt}_loop_exit_block.
---
 gcc/tree-vect-loop-manip.cc | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 5d1b70aea43..53d36eaa25f 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -1686,6 +1686,16 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
 
 	      set_immediate_dominator (CDI_DOMINATORS, new_preheader,
 				       loop->header);
+
+	      /* Fix up the profile counts of the new exit blocks.
+		 main_loop_exit_block was created by duplicating the
+		 preheader, so needs its count scaling according to the main
+		 exit edge's probability.  The remaining count from the
+		 preheader goes to the alt_loop_exit_block, since all
+		 alternative exits have been redirected there.  */
+	      main_loop_exit_block->count = loop_exit->count ();
+	      alt_loop_exit_block->count
+		= preheader->count - main_loop_exit_block->count;
 	    }
 
 	  /* Adjust the epilog loop PHI entry values to continue iteration.