On Thu, 16 Mar 2023, Jakub Jelinek wrote:
> On Thu, Mar 16, 2023 at 12:05:56PM +0000, Richard Biener wrote:
> > On Thu, 16 Mar 2023, Jakub Jelinek wrote:
> >
> > > On Fri, Nov 25, 2022 at 09:26:34PM +0100, Richard Biener via Gcc-patches
> > > wrote:
> > > > > We could
> > > > > probably keep tract if any instrumented code was ever inlined into a
> > > > > given function and perhaps just start ignoring attributes set on
> > > > > types?
> > > >
> > > > But ignoring attributes on types makes all indirect calls not
> > > > const/pure annotatable (OK, they are not pure annotatable because of
> > > > the above bug). I really don't see how to conservatively solve this
> > > > issue?
> > >
> > > > Maybe we can ignore all pure/const when the cgraph state is
> > > > in profile-instrumented state? Of course we have multiple "APIs"
> > > > to query that.
> > >
> > > I think that's the way to go. But we'd need to arrange somewhere during
> > > IPA
> > > to add vops to all those pure/const calls if -fprofile-generate (direct or
> > > indirect; not sure what exact flags) and after that make sure all our APIs
> > > don't return ECF_CONST, ECF_PURE or ECF_LOOPING_CONST_OR_PURE.
> > > Could be e.g.
> > > if (whatever)
> > > flags &= ~(ECF_CONST | ECF_PURE | ECF_LOOPING_CONST_OR_PURE);
> > > at the end of flags_from_decl_or_type, internal_fn_flags, what else?
> > > Although, perhaps internal_fn_flags don't need to change, because internal
> > > calls don't really have callees.
> > >
> > > Or is just ECF_CONST enough given that ECF_PURE is on fndecls and not
> > > types?
> > >
> > > Or perhaps just start ignoring TYPE_READONLY -> ECF_CONST in
> > > flags_from_decl_or_type if that flag is on?
> > >
> > > Is that rewriting currently what tree_profiling does in the
> > > /* Update call statements and rebuild the cgraph. */
> > > FOR_EACH_DEFINED_FUNCTION (node)
> > > spot where it calls update_stmt on all call statements?
> > >
> > > If so, could we just set that global? flag instead or in addition to
> > > doing those node->set_const_flag (false, false); calls and
> > > change flags_from_decl_or_type, plus I guess lto1 should set that
> > > flag if it is global on start as well if
> > > !flag_auto_profile
> > > && (flag_branch_probabilities || flag_test_coverage || profile_arc_flag)
> > > ?
> > >
> > > I think just ignoring ECF_CONST from TYPE_READONLY might be best, if we
> > > have external functions like builtins from libc/libm and don't have their
> > > bodies, we can still treat them as const, the only problem is with the
> > > indirect calls where we don't know if we do or don't have a body for
> > > the callees and whether we've instrumented those or not.
> >
> > I think we want something reflected on the IL. Because of LTO I think
> > we cannot ignore externs (or we have to do massaging at stream-in).
> >
> > The following brute-force works for the testcase. I suppose since
> > we leave const/pure set on functions without body in the compile-time
> > TU (and ignore LTO there) we could do the same here.
> >
> > I also wonder if that whole function walk is necessary if not
> > profile_arc_flag || flag_test_coverage ...
> >
> > Honza - does this look OK to you? I'll test guarding the whole
> > outer loop with profile_arc_flag || flag_test_coverage separately.
> >
> > diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
> > index ea9d7a23443..c8789618f96 100644
> > --- a/gcc/tree-profile.cc
> > +++ b/gcc/tree-profile.cc
> > @@ -840,9 +840,29 @@ tree_profiling (void)
> > gimple_stmt_iterator gsi;
> > for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> > {
> > - gimple *stmt = gsi_stmt (gsi);
> > - if (is_gimple_call (stmt))
> > - update_stmt (stmt);
> > + gcall *call = dyn_cast <gcall *> (gsi_stmt (gsi));
> > + if (!call)
> > + continue;
> > +
> > + if (profile_arc_flag || flag_test_coverage)
> > + {
> > + /* We do not clear pure/const on decls without body. */
> > + tree fndecl = gimple_call_fndecl (call);
> > + if (fndecl && !gimple_has_body_p (fndecl))
> > + continue;
> > +
> > + /* Drop the const attribute from the call type (the pure
> > + attribute is not available on types). */
> > + tree fntype = gimple_call_fntype (call);
> > + if (fntype && TYPE_READONLY (fntype))
> > + gimple_call_set_fntype (call, build_qualified_type
> > + (fntype, (TYPE_QUALS
> > (fntype)
> > + &
> > ~TYPE_QUAL_CONST)));
> > + }
> > +
> > + /* Update virtual operands of calls to no longer const/pure
> > + functions. */
> > + update_stmt (call);
> > }
> > }
>
> I have in the meantime briefly tested following.
>
> But if you want to the above way, then at least the testcase could be
> useful. Though, not sure if the above is all that is needed. Shouldn't
> set_const_flag_1 upon TREE_READONLY (node->decl) = 0; also adjust
> TREE_TYPE on the function to
> tree fntype = TREE_TYPE (node->decl);
> TREE_TYPE (node->decl)
> = build_qualified_type (fntype,
> TYPE_QUALS (fntype) & ~TYPE_QUAL_CONST);
> too (perhaps guarded on TREE_READONLY (fntype))?
That would be unnecessary for the problem at hand I think. Nothing
should look at this type.
Let's wait for Honzas opinion.
Richard.
>
> 2023-03-16 Jakub Jelinek <[email protected]>
>
> PR tree-optimization/106912
> gcc/
> * calls.h (ignore_const_fntype): Declare.
> * calls.cc (flags_from_decl_or_type): Ignore TYPE_READONLY
> on types if ignore_const_fntype.
> * tree-profile.cc: Include calls.h.
> (tree_profiling): Set ignore_const_fntype if profile_arc_flag
> or flag_test_coverage.
> gcc/lto/
> * lto-lang.cc (lto_post_options): Set ignore_const_fntype if
> profile_arc_flag or flag_test_coverage and not flag_auto_profile.
> gcc/testsuite/
> * gcc.dg/pr106912.c: New test.
>
> --- gcc/calls.h.jj 2023-01-02 09:32:51.252868185 +0100
> +++ gcc/calls.h 2023-03-16 12:23:51.632460586 +0100
> @@ -134,5 +134,6 @@ extern void maybe_complain_about_tail_ca
>
> extern rtx rtx_for_static_chain (const_tree, bool);
> extern bool cxx17_empty_base_field_p (const_tree);
> +extern bool ignore_const_fntype;
>
> #endif // GCC_CALLS_H
> --- gcc/calls.cc.jj 2023-02-21 11:44:48.460464845 +0100
> +++ gcc/calls.cc 2023-03-16 12:27:45.427032110 +0100
> @@ -800,6 +800,13 @@ is_tm_builtin (const_tree fndecl)
> return false;
> }
>
> +/* Set if flags_from_decl_or_type should ignore TYPE_READONLY of function
> + types. This is used when tree-profile.cc instruments const calls,
> + clears TREE_READONLY on FUNCTION_DECLs which have been instrumented, but
> + for function types or indirect calls we don't know if the callees have
> been
> + instrumented or not. */
> +bool ignore_const_fntype;
> +
> /* Detect flags (function attributes) from the function decl or type node.
> */
>
> int
> @@ -849,7 +856,7 @@ flags_from_decl_or_type (const_tree exp)
> }
> else if (TYPE_P (exp))
> {
> - if (TYPE_READONLY (exp))
> + if (TYPE_READONLY (exp) && !ignore_const_fntype)
> flags |= ECF_CONST;
>
> if (flag_tm
> --- gcc/tree-profile.cc.jj 2023-01-02 09:32:27.923205268 +0100
> +++ gcc/tree-profile.cc 2023-03-16 12:57:29.130873893 +0100
> @@ -58,6 +58,7 @@ along with GCC; see the file COPYING3.
> #include "alloc-pool.h"
> #include "symbol-summary.h"
> #include "symtab-thunks.h"
> +#include "calls.h"
>
> static GTY(()) tree gcov_type_node;
> static GTY(()) tree tree_interval_profiler_fn;
> @@ -819,6 +820,10 @@ tree_profiling (void)
> node->set_pure_flag (false, false);
> }
>
> + /* From this point on, ignore TYPE_READONLY on function types. */
> + if (profile_arc_flag || flag_test_coverage)
> + ignore_const_fntype = true;
> +
> /* Update call statements and rebuild the cgraph. */
> FOR_EACH_DEFINED_FUNCTION (node)
> {
> --- gcc/lto/lto-lang.cc.jj 2023-01-16 11:52:16.165732856 +0100
> +++ gcc/lto/lto-lang.cc 2023-03-16 12:58:00.420414840 +0100
> @@ -37,6 +37,7 @@ along with GCC; see the file COPYING3.
> #include "lto-common.h"
> #include "stringpool.h"
> #include "attribs.h"
> +#include "calls.h"
>
> /* LTO specific dumps. */
> int lto_link_dump_id, decl_merge_dump_id, partition_dump_id;
> @@ -938,6 +939,11 @@ lto_post_options (const char **pfilename
> if (!flag_merge_constants)
> flag_merge_constants = 1;
>
> + /* If const functions were or might have been instrumented by
> + tree-profiler, ignore TYPE_READONLY on function types. */
> + if (!flag_auto_profile && (profile_arc_flag || flag_test_coverage))
> + ignore_const_fntype = true;
> +
> /* Initialize the compiler back end. */
> return false;
> }
> --- gcc/testsuite/gcc.dg/pr106912.c.jj 2023-03-16 13:02:17.720639903
> +0100
> +++ gcc/testsuite/gcc.dg/pr106912.c 2023-03-16 13:03:47.472323118 +0100
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/106912 */
> +/* { dg-do compile { target vect_simd_clones } } */
> +/* { dg-require-profiling "-fprofile-generate" } */
> +/* { dg-options "-O2 -ftree-vectorize -fprofile-generate" } */
> +/* { dg-additional-options "-fPIC" { target fpic } } */
> +
> +__attribute__ ((__simd__, __nothrow__, __leaf__, __const__))
> +double foo (double x);
> +
> +void
> +bar (double *f, int n)
> +{
> + int i;
> + for (i = 0; i < n; i++)
> + f[i] = foo (f[i]);
> +}
> +
> +double
> +foo (double x)
> +{
> + return x * x / 3.0;
> +}
>
>
> Jakub
>
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)