https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118508

            Bug ID: 118508
           Summary: 10% performance drop when enabling autofdo for
                    spec2017 554.roms_r
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: gcov-profile
          Assignee: unassigned at gcc dot gnu.org
          Reporter: liuhongt at gcc dot gnu.org
  Target Milestone: ---

Observed with -march=x86-64-v3 -O2.

Part of the dump_gcov output looks like this:

__step3d_t_mod_MOD_step3d_t total:5500129 head:0
  0: 0
  29: 0
  30: 0
  31: 0
  32: 0
  36: 0
  37: 0
  38: 0
  39: 0
  46: 0
  59: 0
  60: 0
  62: 0
  59: step3d_t_tile total:5500129
    4: 0
    4.2: 0
    4.4: 0
    4.6: 0
    4.8: 0
    5: 0
    5.2: 0
    5.4: 0
    5.6: 0
    5.8: 0
    5.10: 0
    5.12: 0
    7: 0
    7.2: 0
    7.4: 1
    8: 0
    8.2: 0

step3d_t_tile is local and only called by step3d_t. Autofdo does early
inlining if the edge in the call graph is hot, and it checks the total count
from the callsite. Unfortunately, the name it uses for the lookup is
DECL_ASSEMBLER_NAME (edge->callee->decl), which is
__step3d_t_mod_MOD_step3d_t_tile, but the corresponding name in the afdo
string table is step3d_t_tile (without the prefix; I guess it comes from the
debug string table). The mismatch causes autofdo to lose the profiling info
for step3d_t_tile, so it considers the function cold and optimizes it for size.
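
To make the mismatch concrete, here is a small standalone C++ sketch of the
lookup. It is only a toy model: toy_string_table, callsite_total_count, the
index value 7 and the fallback_name parameter are hypothetical names made up
for illustration, not GCC's actual classes or signatures. It assumes the
string table holds only the unprefixed name, as in the dump above.

```cpp
#include <map>
#include <string>

// Toy stand-in for the AutoFDO string table (hypothetical, not GCC's class).
struct toy_string_table
{
  std::map<std::string, int> indices;

  // Returns the index of NAME, or -1 when NAME is not in the table.
  int get_index (const std::string &name) const
  {
    auto it = indices.find (name);
    return it == indices.end () ? -1 : it->second;
  }
};

// Toy model of the callsite check: return TOTAL only if one of the candidate
// names resolves to the index recorded for the inlined instance.
long callsite_total_count (const toy_string_table &table, int instance_name,
                           long total, const std::string &assembler_name,
                           const std::string &fallback_name = "")
{
  if (table.get_index (assembler_name) == instance_name)
    return total;
  if (!fallback_name.empty ()
      && table.get_index (fallback_name) == instance_name)
    return total;
  return 0;  // No match: the callee looks unprofiled, hence cold.
}
```

With only "step3d_t_tile" in the table, looking up the mangled name alone
yields a count of 0, while also trying the unprefixed name returns the
recorded total.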

A hack like the one below recovers the performance and further improves
554.roms_r by 3% with autofdo:

diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
index 3f890e6d1e6..ae8dd9bfdaf 100644
--- a/gcc/auto-profile.cc
+++ b/gcc/auto-profile.cc
@@ -837,8 +837,10 @@ autofdo_source_profile::get_callsite_total_count (

   function_instance *s = get_function_instance_by_inline_stack (stack);
   if (s == NULL
-      || afdo_string_table->get_index (IDENTIFIER_POINTER (
-             DECL_ASSEMBLER_NAME (edge->callee->decl))) != s->name ())
+      || (afdo_string_table->get_index (IDENTIFIER_POINTER (
+           DECL_ASSEMBLER_NAME (edge->callee->decl))) != s->name ()
+         && afdo_string_table->get_index_by_decl (edge->callee->decl)
+         != s->name()))
     return 0;

   return s->total_count ();
