https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743

--- Comment #29 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Eugene Rozenfeld
<ero...@gcc.gnu.org>:

https://gcc.gnu.org/g:d180e392d7a8ba1346bfe7580de953234f0c2f9d

commit r12-10915-gd180e392d7a8ba1346bfe7580de953234f0c2f9d
Author: Eugene Rozenfeld <ero...@microsoft.com>
Date:   Fri Jan 10 19:48:52 2025 -0800

    Fix setting of call graph node AutoFDO count

    We are initializing both the call graph node count and
    the entry block count of the function with the head_count value
    from the profile.

    The count propagation algorithm may refine the entry block count
    and we may end up with a case where the call graph node count
    is set to zero but the entry block count is non-zero. That becomes
    a problem because we have this code in execute_fixup_cfg:

     profile_count num = node->count;
     profile_count den = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
     bool scale = num.initialized_p () && !(num == den);

    Here if num is 0 but den is not 0, scale becomes true and we
    lose the counts in

    if (scale)
      bb->count = bb->count.apply_scale (num, den);

    This is what happened in the issue reported in PR116743
    (a 10% regression in MySQL HAMMERDB tests).
    3d9e6767939e9658260e2506e81ec32b37cba041 made an improvement in
    AutoFDO count propagation, which caused a mismatch between
    the call graph node count (zero) and the entry block count (non-zero)
    and subsequent loss of counts as described above.

    The fix is to update the call graph node count once we've done count
    propagation.
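
    The failure mode and the fix described above can be illustrated with a
    small standalone sketch. This is not GCC code: ToyCount, toy_fixup and
    toy_sync_node_count are hypothetical names, and ToyCount is a deliberately
    simplified stand-in for profile_count. It only assumes the scaling
    condition quoted from execute_fixup_cfg and that scaling by a zero
    numerator yields zero.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for GCC's profile_count: a value plus an
// "initialized" flag. The real class is considerably richer.
struct ToyCount {
  std::uint64_t value = 0;
  bool initialized = false;
};

// Toy model of the scaling step quoted from execute_fixup_cfg.
// bb_counts[0] plays the role of the entry block count.
// Returns true if the block counts were rescaled.
bool toy_fixup(const ToyCount &node_count,
               std::vector<std::uint64_t> &bb_counts) {
  assert(!bb_counts.empty());
  const std::uint64_t num = node_count.value;
  const std::uint64_t den = bb_counts[0];
  const bool scale = node_count.initialized && num != den;
  if (scale)
    for (auto &c : bb_counts)
      c = den ? c * num / den : 0;  // num == 0 wipes every count
  return scale;
}

// Toy model of the fix: once propagation is done, copy the (possibly
// refined) entry block count back into the node count so num == den.
void toy_sync_node_count(ToyCount &node_count,
                         const std::vector<std::uint64_t> &bb_counts) {
  assert(!bb_counts.empty());
  node_count.value = bb_counts[0];
  node_count.initialized = true;
}
```

    With a node count of zero and block counts {100, 60, 40}, toy_fixup
    rescales every block count to zero. Calling toy_sync_node_count first
    makes num equal den, so no scaling happens and the counts survive.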

    Tested on x86_64-pc-linux-gnu.

    gcc/ChangeLog:
            PR gcov-profile/116743
            * auto-profile.cc (afdo_annotate_cfg): Fix mismatch between the
            call graph node count and the entry block count.

    (cherry picked from commit e683c6b029f809c7a1981b4341c95d9652c22e18)
