On Thu, Jan 16, 2025 at 3:17 AM Eugene Rozenfeld <eugene.rozenf...@microsoft.com> wrote: > > I committed the patch to trunk. Is it ok to backport to gcc-12, gcc-13, and > gcc-14?
Yes. > -----Original Message----- > From: Richard Biener <richard.guent...@gmail.com> > Sent: Monday, January 13, 2025 11:22 PM > To: Eugene Rozenfeld <eugene.rozenf...@microsoft.com> > Cc: GCC-Patches-ML <gcc-patches@gcc.gnu.org>; Jan Hubicka <hubi...@ucw.cz>; > rvmal...@amazon.com > Subject: [EXTERNAL] Re: [PATCH] Fix setting of call graph node AutoFDO count > [PR116743] > > On Mon, Jan 13, 2025 at 10:47 PM Eugene Rozenfeld > <eugene.rozenf...@microsoft.com> wrote: > > > > We are initializing both the call graph node count and > > > > the entry block count of the function with the head_count value > > > > from the profile. > > > > > > > > Count propagation algorithm may refine the entry block count > > > > and we may end up with a case where the call graph node count > > > > is set to 0 but the entry block count is non-zero. That becomes > > > > a problem because we have this code in execute_fixup_cfg: > > > > > > > > profile_count num = node->count; > > > > profile_count den = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count; > > > > bool scale = num.initialized_p () && !(num == den); > > > > > > > > Here if num is 0 but den is not 0, scale becomes true and we > > > > lose the counts in > > > > > > > > if (scale) > > > > bb->count = bb->count.apply_scale (num, den); > > > > > > > > This is what happened the issue reported in PR116743 > > > > (a 10% regression in MySQL HAMMERDB tests). > > > > 3d9e6767939e9658260e2506e81ec32b37cba041 made an improvement in > > > > AutoFDO count propagation, which caused the mismatch between > > > > the call graph node count (zero) and the entry block count (non-zero) > > > > and subsequent loss of counts as described above. > > > > > > > > The fix is to update the call graph node count once we've done count > > propagation. > > > > > > > > Tested on x86_64-pc-linux-gnu. > > OK. > > Thanks, > Richard. > > > > > > > gcc/ChangeLog: > > > > PR gcov-profile/116743 > > > > * auto-profile.c (afdo_annotate_cfg): Fix mismatch > > between the call graph node count > > > > and the entry block count. > > > > --- > > > > gcc/auto-profile.cc | 4 ++-- > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc > > > > index 5d0e8afb9a1..aa4d1634f01 100644 > > > > --- a/gcc/auto-profile.cc > > > > +++ b/gcc/auto-profile.cc > > > > @@ -1538,8 +1538,6 @@ afdo_annotate_cfg (const stmt_set > > &promoted_stmts) > > > > if (s == NULL) > > > > return; > > > > - cgraph_node::get (current_function_decl)->count > > > > - = profile_count::from_gcov_type (s->head_count ()).afdo (); > > > > ENTRY_BLOCK_PTR_FOR_FN (cfun)->count > > > > = profile_count::from_gcov_type (s->head_count ()).afdo (); > > > > EXIT_BLOCK_PTR_FOR_FN (cfun)->count = profile_count::zero ().afdo > > (); > > > > @@ -1578,6 +1576,8 @@ afdo_annotate_cfg (const stmt_set > > &promoted_stmts) > > > > /* Calculate, propagate count and probability information on > > CFG. */ > > > > afdo_calculate_branch_prob (&annotated_bb); > > > > } > > > > + cgraph_node::get(current_function_decl)->count > > > > + = ENTRY_BLOCK_PTR_FOR_FN(cfun)->count; > > > > update_max_bb_count (); > > > > profile_status_for_fn (cfun) = PROFILE_READ; > > > > if (flag_value_profile_transformations) > > > > -- > > > > 2.34.1 > > > >