On Thu, Jan 16, 2025 at 3:17 AM Eugene Rozenfeld
<eugene.rozenf...@microsoft.com> wrote:
>
> I committed the patch to trunk. Is it ok to backport to gcc-12, gcc-13, and 
> gcc-14?

Yes.

> -----Original Message-----
> From: Richard Biener <richard.guent...@gmail.com>
> Sent: Monday, January 13, 2025 11:22 PM
> To: Eugene Rozenfeld <eugene.rozenf...@microsoft.com>
> Cc: GCC-Patches-ML <gcc-patches@gcc.gnu.org>; Jan Hubicka <hubi...@ucw.cz>; 
> rvmal...@amazon.com
> Subject: [EXTERNAL] Re: [PATCH] Fix setting of call graph node AutoFDO count 
> [PR116743]
>
> On Mon, Jan 13, 2025 at 10:47 PM Eugene Rozenfeld 
> <eugene.rozenf...@microsoft.com> wrote:
> >
> > We are initializing both the call graph node count and
> >
> > the entry block count of the function with the head_count value
> >
> > from the profile.
> >
> >
> >
> > Count propagation algorithm may refine the entry block count
> >
> > and we may end up with a case where the call graph node count
> >
> > is set to 0 but the entry block count is non-zero. That becomes
> >
> > a problem because we have this code in execute_fixup_cfg:
> >
> >
> >
> > profile_count num = node->count;
> >
> > profile_count den = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
> >
> > bool scale = num.initialized_p () && !(num == den);
> >
> >
> >
> > Here if num is 0 but den is not 0, scale becomes true and we
> >
> > lose the counts in
> >
> >
> >
> > if (scale)
> >
> >   bb->count = bb->count.apply_scale (num, den);
> >
> >
> >
> > This is what happened the issue reported in PR116743
> >
> > (a 10% regression in MySQL HAMMERDB tests).
> >
> > 3d9e6767939e9658260e2506e81ec32b37cba041 made an improvement in
> >
> > AutoFDO count propagation, which caused the mismatch between
> >
> > the call graph node count (zero) and the entry block count (non-zero)
> >
> > and subsequent loss of counts as described above.
> >
> >
> >
> > The fix is to update the call graph node count once we've done count 
> > propagation.
> >
> >
> >
> > Tested on x86_64-pc-linux-gnu.
>
> OK.
>
> Thanks,
> Richard.
>
> >
> >
> > gcc/ChangeLog:
> >
> >                 PR gcov-profile/116743
> >
> >                 * auto-profile.c (afdo_annotate_cfg): Fix mismatch
> > between the call graph node count
> >
> >                 and the entry block count.
> >
> > ---
> >
> > gcc/auto-profile.cc | 4 ++--
> >
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> >
> >
> > diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
> >
> > index 5d0e8afb9a1..aa4d1634f01 100644
> >
> > --- a/gcc/auto-profile.cc
> >
> > +++ b/gcc/auto-profile.cc
> >
> > @@ -1538,8 +1538,6 @@ afdo_annotate_cfg (const stmt_set
> > &promoted_stmts)
> >
> >    if (s == NULL)
> >
> >      return;
> >
> > -  cgraph_node::get (current_function_decl)->count
> >
> > -     = profile_count::from_gcov_type (s->head_count ()).afdo ();
> >
> >    ENTRY_BLOCK_PTR_FOR_FN (cfun)->count
> >
> >       = profile_count::from_gcov_type (s->head_count ()).afdo ();
> >
> >    EXIT_BLOCK_PTR_FOR_FN (cfun)->count = profile_count::zero ().afdo
> > ();
> >
> > @@ -1578,6 +1576,8 @@ afdo_annotate_cfg (const stmt_set
> > &promoted_stmts)
> >
> >        /* Calculate, propagate count and probability information on
> > CFG.  */
> >
> >        afdo_calculate_branch_prob (&annotated_bb);
> >
> >      }
> >
> > +  cgraph_node::get(current_function_decl)->count
> >
> > +      = ENTRY_BLOCK_PTR_FOR_FN(cfun)->count;
> >
> >    update_max_bb_count ();
> >
> >    profile_status_for_fn (cfun) = PROFILE_READ;
> >
> >    if (flag_value_profile_transformations)
> >
> > --
> >
> > 2.34.1
> >
> >

Reply via email to