https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743
--- Comment #29 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Eugene Rozenfeld
<ero...@gcc.gnu.org>:

https://gcc.gnu.org/g:d180e392d7a8ba1346bfe7580de953234f0c2f9d

commit r12-10915-gd180e392d7a8ba1346bfe7580de953234f0c2f9d
Author: Eugene Rozenfeld <ero...@microsoft.com>
Date:   Fri Jan 10 19:48:52 2025 -0800

    Fix setting of call graph node AutoFDO count

    We are initializing both the call graph node count and the entry block
    count of the function with the head_count value from the profile.

    Count propagation algorithm may refine the entry block count
    and we may end up with a case where the call graph node count
    is set to zero but the entry block count is non-zero. That becomes
    a problem because we have this code in execute_fixup_cfg:

      profile_count num = node->count;
      profile_count den = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
      bool scale = num.initialized_p () && !(num == den);

    Here if num is 0 but den is not 0, scale becomes true and we
    lose the counts in

      if (scale)
        bb->count = bb->count.apply_scale (num, den);

    This is what happened in the issue reported in PR116743
    (a 10% regression in MySQL HAMMERDB tests).
    3d9e6767939e9658260e2506e81ec32b37cba041 made an improvement in
    AutoFDO count propagation, which caused a mismatch between
    the call graph node count (zero) and the entry block count (non-zero)
    and subsequent loss of counts as described above.

    The fix is to update the call graph node count once we've done count
    propagation.

    Tested on x86_64-pc-linux-gnu.

    gcc/ChangeLog:
            PR gcov-profile/116743
            * auto-profile.cc (afdo_annotate_cfg): Fix mismatch between the
            call graph node count and the entry block count.

    (cherry picked from commit e683c6b029f809c7a1981b4341c95d9652c22e18)
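
To illustrate the failure mode described in the commit message, here is a minimal standalone sketch. It is not GCC code: ToyCount is a hypothetical stand-in for profile_count (a value plus an "initialized" flag, scaled with plain integer arithmetic), and toy_fixup and total_after_fixup are made-up helpers that only mirror the shape of the check quoted from execute_fixup_cfg.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for GCC's profile_count; the real class is richer.
struct ToyCount {
  bool initialized;
  std::uint64_t value;

  bool operator==(const ToyCount &o) const {
    return initialized == o.initialized && value == o.value;
  }

  // Scale this count by num/den using plain integer arithmetic (toy model).
  ToyCount apply_scale(const ToyCount &num, const ToyCount &den) const {
    assert(den.value != 0 && "toy model: denominator must be non-zero");
    return ToyCount{true, value * num.value / den.value};
  }
};

// Mirrors the shape of the check quoted from execute_fixup_cfg:
// rescale every block when the node count and entry count disagree.
void toy_fixup(const ToyCount &node_count, const ToyCount &entry_count,
               std::vector<ToyCount> &bbs) {
  ToyCount num = node_count;
  ToyCount den = entry_count;
  bool scale = num.initialized && !(num == den);
  if (scale)
    for (ToyCount &bb : bbs)
      bb = bb.apply_scale(num, den);
}

// Hypothetical helper: build three blocks (entry block first), run the
// fixup, and return the sum of the resulting block counts.
std::uint64_t total_after_fixup(std::uint64_t node, std::uint64_t entry) {
  std::vector<ToyCount> bbs = {{true, entry}, {true, 40}, {true, 60}};
  toy_fixup({true, node}, {true, entry}, bbs);
  std::uint64_t total = 0;
  for (const ToyCount &bb : bbs)
    total += bb.value;
  return total;
}
```

In this toy model, a node count of 0 with an entry count of 100 scales every block count to zero, while a node count that matches the entry count leaves the blocks untouched. That corresponds to the fix described above: once the call graph node count is updated after propagation, num equals den and no scaling takes place.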