> > The following is what I profile-bootstrapped and tested on > x86_64-unknown-linux-gnu. > > Richard. > > From d438a0d84cafced85c90204cba81de0f60ad0073 Mon Sep 17 00:00:00 2001 > From: Richard Biener <rguent...@suse.de> > Date: Thu, 16 Mar 2023 13:51:19 +0100 > Subject: [PATCH] tree-optimization/106912 - clear const attribute from fntype > To: gcc-patches@gcc.gnu.org > > The following makes sure that after clearing pure/const from > instrumented function declarations we are adjusting call statements > fntype as well to handle indirect calls and because gimple_call_flags > looks at both decl and fntype. > > Like the pure/const flag clearing on decls we refrain from touching > calls to known functions that do not have a body in the current TU. > > PR tree-optimization/106912 > * tree-profile.cc (tree_profiling): Update stmts only when > profiling or testing coverage. Make sure to update calls > fntype, stripping 'const' there. > > * gcc.dg/profile-generate-4.c: New testcase. > --- > gcc/testsuite/gcc.dg/profile-generate-4.c | 19 ++++++++++++ > gcc/tree-profile.cc | 38 +++++++++++++++++------ > 2 files changed, 47 insertions(+), 10 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/profile-generate-4.c > > diff --git a/gcc/testsuite/gcc.dg/profile-generate-4.c > b/gcc/testsuite/gcc.dg/profile-generate-4.c > new file mode 100644 > index 00000000000..c2b999fe4cb > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/profile-generate-4.c > @@ -0,0 +1,19 @@ > +/* PR106912 */ > +/* { dg-require-profiling "-fprofile-generate" } */ > +/* { dg-options "-O2 -fprofile-generate -ftree-vectorize" } */ > + > +__attribute__ ((__simd__)) > +__attribute__ ((__nothrow__ , __leaf__ , __const__, __noinline__)) > +double foo (double x); > + > +void bar(double *f, int n) > +{ > + int i; > + for (i = 0; i < n; i++) > + f[i] = foo(f[i]); > +} > + > +double foo(double x) > +{ > + return x * x / 3.0; > +} > diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc > index ea9d7a23443..7854cd4bc31 100644 > --- a/gcc/tree-profile.cc > +++ b/gcc/tree-profile.cc > @@ -835,16 +835,34 @@ tree_profiling (void) > > push_cfun (DECL_STRUCT_FUNCTION (node->decl)); > > - FOR_EACH_BB_FN (bb, cfun) > - { > - gimple_stmt_iterator gsi; > - for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) > - { > - gimple *stmt = gsi_stmt (gsi); > - if (is_gimple_call (stmt)) > - update_stmt (stmt); > - } > - } > + if (profile_arc_flag || flag_test_coverage) > + FOR_EACH_BB_FN (bb, cfun) > + { > + gimple_stmt_iterator gsi; > + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) > + { > + gcall *call = dyn_cast <gcall *> (gsi_stmt (gsi)); > + if (!call) > + continue; > + > + /* We do not clear pure/const on decls without body. */ > + tree fndecl = gimple_call_fndecl (call); > + if (fndecl && !gimple_has_body_p (fndecl)) > + continue; > + > + /* Drop the const attribute from the call type (the pure > + attribute is not available on types). */ > + tree fntype = gimple_call_fntype (call); > + if (fntype && TYPE_READONLY (fntype)) > + gimple_call_set_fntype > + (call, build_qualified_type (fntype, (TYPE_QUALS (fntype) > + & ~TYPE_QUAL_CONST)));
Sorry, now I am bit confused on why Jakub's fix did not need similar fixup. The flag is only set during the profiling stage and thus I would expect it to still run into the problem that VOPs are missing. Is it only becuase we do not sanity check? Here is a testcase that shows the problem: include <math.h> float c; __attribute__ ((const)) float instrument(float v) { return sin (v); } __attribute__ ((no_profile_instrument_function,const,noinline)) float noinstrument (float v) { return instrument (v); } m() { c+=instrument(c); if (!__builtin_expect (c,1)) { c+=noinstrument (c); } c+=instrument(c); } main() { m(); } Compiling gcc -O0 t.c -fprofile-arcs -fno-early-inlining --coverage -lm -ftest-coverage -S ; gcc t.s -ftest-coverage -lm -fprofile-arcs makes gcov to report 3 executions on instrument while with -O3 it is 2. This is because the noinstrument has const flag that is not cleared and it becoames essentially non-const because it calls instrument that gets intrumented and partially inlined. Honza > + > + /* Update virtual operands of calls to no longer const/pure > + functions. */ > + update_stmt (call); > + } > + } > > /* re-merge split blocks. */ > cleanup_tree_cfg (); > -- > 2.35.3 >