On Thu, Nov 14, 2024 at 5:03 PM Jakub Jelinek <ja...@redhat.com> wrote:
>
> Hi!
>
> The inlining heuristics use DECL_DECLARED_INLINE_P (whether a function
> has been explicitly marked inline; that can be the inline keyword, or for
> C++ also the constexpr keyword or defining a function inside of a class
> definition) heavily to increase the desirability of inlining a function etc.
> In most cases it is desirable, people usually mark functions inline with
> the intent that they are actually inlined.
> But as PR93008 shows, that isn't always the case.
> One can mark a (usually large or cold) function constexpr just because the
> standard requires it to be constexpr or because it is useful to let users
> evaluate the function in constant expressions.  The author then doesn't mind
> if the compiler chooses to inline it when that is really worth it, but
> preferring to inline it might not be that good an idea.  Especially with
> recent versions of C++ where pretty much everything has been or is going to
> be constexpr.
> Or one might e.g. use the inline keyword to get the C++ comdat behavior,
> again with no particular intent that inlining such a function is a good idea.
>
> This patch introduces a new attribute for weaker inline semantics (basically
> the function behaves as inline for FE/debug info purposes, but for
> optimization decisions it is treated as if it wasn't explicitly inline); I
> haven't used weak_inline for the attribute name because one could confuse
> that with the weak attribute, and this has nothing to do with that.
>
> So far smoke tested on x86_64-linux, ok for trunk if it passes full
> bootstrap/regtest?

Sorry for chiming in this late - to me this shows that the desire to inline
a function more than another function, currently identified by
DECL_DECLARED_INLINE_P, overlaps with frontend semantic differences.
But don't we reflect those semantic differences into the IL (via linkage,
symtab details) already?  So what would remain is a way for the user
to distinguish between auto-inline (we have the corresponding -auto
set of --params) and inline-inline.  The middle-end interface after your
change, where DECL_DECLARED_INLINE_P means inline-inline
unless !DECL_OPTIMIZABLE_INLINE_P, looks a bit awkward.

Rather than clearing DECL_DECLARED_INLINE_P I'd suggest splitting
the two completely: turn DECL_DISREGARD_INLINE_LIMITS,
DECL_UNINLINABLE and auto-inline vs. inline-inline into a
multi-bit enum and only use that for inlining decisions (ignoring
DECL_DECLARED_INLINE_P for that purpose, but using it
and feeble_inline to compute the enum value).
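
A rough, untested sketch of what such an enum could look like follows.
The names inline_kind, decl_flags and compute_inline_kind are hypothetical
and exist nowhere in the tree; decl_flags merely stands in for the existing
FUNCTION_DECL bits, and the precedence between the flags is an assumption.

```cpp
// Hypothetical sketch; none of these names exist in GCC.
enum class inline_kind : unsigned char
{
  uninlinable,      // what DECL_UNINLINABLE expresses today
  auto_inline,      // candidate only under the -auto limits
  declared_inline,  // the user asked for inlining ("inline-inline")
  disregard_limits  // what DECL_DISREGARD_INLINE_LIMITS expresses today
};

// Stand-in for the per-decl flags that would feed the enum.
struct decl_flags
{
  bool declared_inline;          // DECL_DECLARED_INLINE_P
  bool feeble_inline;            // DECL_FEEBLE_INLINE_P from the patch
  bool disregard_inline_limits;  // DECL_DISREGARD_INLINE_LIMITS
  bool uninlinable;              // DECL_UNINLINABLE
};

// Compute the enum value once; inlining decisions would then look only at
// the result.  Assumed precedence: uninlinable wins, then disregard-limits.
constexpr inline_kind
compute_inline_kind (decl_flags f)
{
  return f.uninlinable ? inline_kind::uninlinable
       : f.disregard_inline_limits ? inline_kind::disregard_limits
       : (f.declared_inline && !f.feeble_inline) ? inline_kind::declared_inline
       : inline_kind::auto_inline;
}
```

With this shape a declared-inline function carrying feeble_inline simply
maps to auto_inline, so the inliner would not need a combined predicate
like DECL_OPTIMIZABLE_INLINE_P.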

Note I had to look up what 'feeble' means - given we use -auto
for --params I'd have chosen __attribute__((auto_inline)), possibly
complemented by __attribute__((inline)) to mark a function as
wanting 'inline' heuristics but not 'inline' semantics.
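
For illustration only, user code with that spelling might look like the
sketch below.  auto_inline is just the name floated above and is not assumed
to be known to any released compiler, so the hypothetical AUTO_INLINE macro
expands to nothing unless __has_attribute reports support; hex_digit_value
is likewise a made-up example function.

```cpp
// Hypothetical: auto_inline is only the attribute name proposed above, and
// AUTO_INLINE is a helper macro invented for this sketch.  It expands to
// nothing unless the compiler reports that it knows the attribute.
#if defined(__has_attribute)
# if __has_attribute(auto_inline)
#  define AUTO_INLINE __attribute__((auto_inline))
# endif
#endif
#ifndef AUTO_INLINE
# define AUTO_INLINE
#endif

// constexpr only so that callers can use it in constant expressions; the
// attribute would ask the inliner to apply the -auto limits, as it does for
// functions that are not declared inline.
AUTO_INLINE constexpr unsigned
hex_digit_value (char c)
{
  return (c >= '0' && c <= '9') ? unsigned (c - '0')
       : (c >= 'a' && c <= 'f') ? unsigned (c - 'a' + 10)
       : (c >= 'A' && c <= 'F') ? unsigned (c - 'A' + 10)
       : 16u; // 16 signals "not a hex digit"
}

// Constant evaluation is unaffected by the inlining hint.
static_assert (hex_digit_value ('f') == 15, "usable in constant expressions");
```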

Again, sorry for chiming in late.

Thanks,
Richard.

> 2024-11-14  Jakub Jelinek  <ja...@redhat.com>
>
>         PR c++/93008
> gcc/
>         * tree-core.h (struct tree_function_decl): Add feeble_inline_flag
>         bitfield.
>         * tree.h (DECL_FEEBLE_INLINE_P, DECL_OPTIMIZABLE_INLINE_P): Define.
>         * cgraphunit.cc (process_function_and_variable_attributes): Warn
>         on feeble_inline attribute on !DECL_DECLARED_INLINE_P function.
>         * symtab.cc (symtab_node::fixup_same_cpp_alias_visibility): Copy
>         over DECL_FEEBLE_INLINE_P as well.  Formatting fixes.
>         * tree-inline.cc (tree_inlinable_function_p, expand_call_inline): Use
>         DECL_OPTIMIZABLE_INLINE_P instead of DECL_DECLARED_INLINE_P.
>         * ipa-cp.cc (devirtualization_time_bonus): Likewise.
>         * ipa-fnsummary.cc (ipa_call_context::estimate_size_and_time):
>         Likewise.
>         * ipa-icf.cc (sem_item::compare_referenced_symbol_properties): Punt
>         on DECL_FEEBLE_INLINE_P differences.
>         (sem_item::hash_referenced_symbol_properties): Hash also
>         DECL_FEEBLE_INLINE_P and DECL_IS_REPLACEABLE_OPERATOR.
>         * ipa-inline.cc (can_inline_edge_by_limits_p,
>         want_early_inline_function_p, want_inline_small_function_p,
>         want_inline_self_recursive_call_p, wrapper_heuristics_may_apply,
>         edge_badness, recursive_inlining, early_inline_small_functions): Use
>         DECL_OPTIMIZABLE_INLINE_P instead of DECL_DECLARED_INLINE_P.
>         * ipa-split.cc (consider_split, execute_split_functions): Likewise.
>         * lto-streamer-out.cc (hash_tree): Hash DECL_FEEBLE_INLINE_P and
>         DECL_IS_REPLACEABLE_OPERATOR.
>         * tree-streamer-in.cc (unpack_ts_function_decl_value_fields): Unpack
>         DECL_FEEBLE_INLINE_P.
>         * tree-streamer-out.cc (pack_ts_function_decl_value_fields): Pack
>         DECL_FEEBLE_INLINE_P.
>         * doc/invoke.texi (Winline): Document feeble_inline functions aren't
>         warned about.
>         * doc/extend.texi (feeble_inline function attribute): Document.
> gcc/c-family/
>         * c-attribs.cc (attr_always_inline_exclusions,
>         attr_noinline_exclusions): Add feeble_inline.
>         (attr_feeble_inline_exclusions): New variable.
>         (c_common_gnu_attributes): Add feeble_inline attribute.
>         (handle_feeble_inline_attribute): New function.
> gcc/c/
>         * c-decl.cc (merge_decls): Merge DECL_FEEBLE_INLINE_P.
> gcc/cp/
>         * decl.cc (duplicate_decls): Merge DECL_FEEBLE_INLINE_P.
>         * method.cc (implicitly_declare_fn): Copy DECL_FEEBLE_INLINE_P.
>         * optimize.cc (maybe_clone_body): Likewise.
> gcc/testsuite/
>         * c-c++-common/attr-feeble_inline-1.c: New test.
>         * gcc.dg/attr-feeble_inline-1.c: New test.
>         * g++.dg/ext/attr-feeble_inline-1.C: New test.
>         * g++.dg/ext/attr-feeble_inline-2.C: New test.
>         * g++.dg/ext/attr-feeble_inline-2.cc: New test.
>
> --- gcc/tree-core.h.jj  2024-11-14 12:25:50.257322008 +0100
> +++ gcc/tree-core.h     2024-11-14 13:12:50.711972132 +0100
> @@ -2054,8 +2054,9 @@ struct GTY(()) tree_function_decl {
>    unsigned has_debug_args_flag : 1;
>    unsigned versioned_function : 1;
>    unsigned replaceable_operator : 1;
> +  unsigned feeble_inline_flag : 1;
>
> -  /* 11 bits left for future expansion.  */
> +  /* 10 bits left for future expansion.  */
>    /* 32 bits on 64-bit HW.  */
>  };
>
> --- gcc/tree.h.jj       2024-11-14 12:25:50.434319603 +0100
> +++ gcc/tree.h  2024-11-14 13:14:32.541537830 +0100
> @@ -3480,6 +3480,20 @@ set_function_decl_type (tree decl, funct
>  #define DECL_NO_INLINE_WARNING_P(NODE) \
>    (FUNCTION_DECL_CHECK (NODE)->function_decl.no_inline_warning_flag)
>
> +/* Nonzero in a FUNCTION_DECL means this function has the
> +   feeble_inline attribute: while it is DECL_DECLARED_INLINE_P,
> +   that declaration should be ignored for optimization purposes,
> +   as it is declared inline only for other reasons (C++ constexpr
> +   so that it can be constant evaluated, or to get the C++ comdat
> +   behavior of inline functions).  */
> +#define DECL_FEEBLE_INLINE_P(NODE) \
> +  (FUNCTION_DECL_CHECK (NODE)->function_decl.feeble_inline_flag)
> +
> +/* Nonzero if inlining should prefer inlining this function.
> +   Shorthand for DECL_DECLARED_INLINE_P && !DECL_FEEBLE_INLINE_P.  */
> +#define DECL_OPTIMIZABLE_INLINE_P(NODE) \
> +  (DECL_DECLARED_INLINE_P (NODE) && !DECL_FEEBLE_INLINE_P (NODE))
> +
>  /* Nonzero if a FUNCTION_CODE is a TM load/store.  */
>  #define BUILTIN_TM_LOAD_STORE_P(FN) \
>    ((FN) >= BUILT_IN_TM_STORE_1 && (FN) <= BUILT_IN_TM_LOAD_RFW_LDOUBLE)
> --- gcc/cgraphunit.cc.jj        2024-10-25 10:00:29.331769817 +0200
> +++ gcc/cgraphunit.cc   2024-11-14 16:25:49.645165038 +0100
> @@ -924,6 +924,13 @@ process_function_and_variable_attributes
>                     "%<always_inline%> function might not be inlinable"
>                     " unless also declared %<inline%>");
>
> +      if (!DECL_DECLARED_INLINE_P (decl)
> +         && DECL_FEEBLE_INLINE_P (decl)
> +         && lookup_attribute ("feeble_inline", DECL_ATTRIBUTES (decl)))
> +       warning_at (DECL_SOURCE_LOCATION (decl), OPT_Wattributes,
> +                   "%<feeble_inline%> attribute ignored on a function"
> +                   " not declared %<inline%>");
> +
>        process_common_attributes (node, decl);
>      }
>    for (vnode = symtab->first_variable (); vnode != first_var;
> --- gcc/symtab.cc.jj    2024-10-25 10:00:29.523767070 +0200
> +++ gcc/symtab.cc       2024-11-14 13:56:48.257678809 +0100
> @@ -1704,9 +1704,11 @@ symtab_node::fixup_same_cpp_alias_visibi
>    if (is_a <cgraph_node *> (this))
>      {
>        DECL_DECLARED_INLINE_P (decl)
> -        = DECL_DECLARED_INLINE_P (target->decl);
> +       = DECL_DECLARED_INLINE_P (target->decl);
>        DECL_DISREGARD_INLINE_LIMITS (decl)
> -        = DECL_DISREGARD_INLINE_LIMITS (target->decl);
> +       = DECL_DISREGARD_INLINE_LIMITS (target->decl);
> +      DECL_FEEBLE_INLINE_P (decl)
> +       = DECL_FEEBLE_INLINE_P (target->decl);
>      }
>    /* FIXME: It is not really clear why those flags should not be copied for
>       functions, too.  */
> --- gcc/tree-inline.cc.jj       2024-11-08 13:35:45.858585408 +0100
> +++ gcc/tree-inline.cc  2024-11-14 14:00:07.663847156 +0100
> @@ -4167,7 +4167,7 @@ tree_inlinable_function_p (tree fn)
>
>    /* We only warn for functions declared `inline' by the user.  */
>    do_warning = (opt_for_fn (fn, warn_inline)
> -               && DECL_DECLARED_INLINE_P (fn)
> +               && DECL_OPTIMIZABLE_INLINE_P (fn)
>                 && !DECL_NO_INLINE_WARNING_P (fn)
>                 && !DECL_IN_SYSTEM_HEADER (fn));
>
> @@ -4876,7 +4876,7 @@ expand_call_inline (basic_block bb, gimp
>                     "called from this function");
>         }
>        else if (opt_for_fn (fn, warn_inline)
> -              && DECL_DECLARED_INLINE_P (fn)
> +              && DECL_OPTIMIZABLE_INLINE_P (fn)
>                && !DECL_NO_INLINE_WARNING_P (fn)
>                && !DECL_IN_SYSTEM_HEADER (fn)
>                && reason != CIF_UNSPECIFIED
> --- gcc/ipa-cp.cc.jj    2024-10-24 18:53:39.507069334 +0200
> +++ gcc/ipa-cp.cc       2024-11-14 14:13:59.952057411 +0100
> @@ -3291,7 +3291,7 @@ devirtualization_time_bonus (struct cgra
>        else if (size <= max_inline_insns_auto / 2)
>         res += 15 / ((int)speculative + 1);
>        else if (size <= max_inline_insns_auto
> -              || DECL_DECLARED_INLINE_P (callee->decl))
> +              || DECL_OPTIMIZABLE_INLINE_P (callee->decl))
>         res += 7 / ((int)speculative + 1);
>      }
>
> --- gcc/ipa-fnsummary.cc.jj     2024-10-25 10:00:29.478767714 +0200
> +++ gcc/ipa-fnsummary.cc        2024-11-14 14:11:54.147832203 +0100
> @@ -3961,10 +3961,10 @@ ipa_call_context::estimate_size_and_time
>      {
>        if (info->scc_no)
>         hints |= INLINE_HINT_in_scc;
> -      if (DECL_DECLARED_INLINE_P (m_node->decl))
> +      if (DECL_OPTIMIZABLE_INLINE_P (m_node->decl))
>         hints |= INLINE_HINT_declared_inline;
>        if (info->builtin_constant_p_parms.length ()
> -         && DECL_DECLARED_INLINE_P (m_node->decl))
> +         && DECL_OPTIMIZABLE_INLINE_P (m_node->decl))
>         hints |= INLINE_HINT_builtin_constant_p;
>
>        ipa_freqcounting_predicate *fcp;
> --- gcc/ipa-icf.cc.jj   2024-10-25 10:00:29.478767714 +0200
> +++ gcc/ipa-icf.cc      2024-11-14 14:10:53.263691126 +0100
> @@ -357,6 +357,10 @@ sem_item::compare_referenced_symbol_prop
>           if (DECL_DECLARED_INLINE_P (n1->decl)
>               != DECL_DECLARED_INLINE_P (n2->decl))
>             return return_false_with_msg ("inline attributes are different");
> +
> +         if (DECL_FEEBLE_INLINE_P (n1->decl)
> +             != DECL_FEEBLE_INLINE_P (n2->decl))
> +           return return_false_with_msg ("feeble_inline attributes are different");
>         }
>
>        if (DECL_IS_OPERATOR_NEW_P (n1->decl)
> @@ -427,8 +431,10 @@ sem_item::hash_referenced_symbol_propert
>         {
>           hstate.add_flag (DECL_DISREGARD_INLINE_LIMITS (ref->decl));
>           hstate.add_flag (DECL_DECLARED_INLINE_P (ref->decl));
> +         hstate.add_flag (DECL_FEEBLE_INLINE_P (ref->decl));
>         }
>        hstate.add_flag (DECL_IS_OPERATOR_NEW_P (ref->decl));
> +      hstate.add_flag (DECL_IS_REPLACEABLE_OPERATOR (ref->decl));
>      }
>    else if (is_a <varpool_node *> (ref))
>      {
> --- gcc/ipa-inline.cc.jj        2024-10-25 10:00:29.479767699 +0200
> +++ gcc/ipa-inline.cc   2024-11-14 14:21:41.009553029 +0100
> @@ -649,7 +649,7 @@ can_inline_edge_by_limits_p (struct cgra
>         {
>           int growth = estimate_edge_growth (e);
>           if (growth > opt_for_fn (caller->decl, param_max_inline_insns_size)
> -             && (!DECL_DECLARED_INLINE_P (callee->decl)
> +             && (!DECL_OPTIMIZABLE_INLINE_P (callee->decl)
>                   && growth >= MAX (inline_insns_single (caller, false, false),
>                                     inline_insns_auto (caller, false, false))))
>             {
> @@ -789,7 +789,7 @@ want_early_inline_function_p (struct cgr
>         * the cloned callee has enough samples to be considered "hot".  */
>    else if (flag_auto_profile && afdo_callsite_hot_enough_for_early_inline (e))
>      ;
> -  else if (!DECL_DECLARED_INLINE_P (callee->decl)
> +  else if (!DECL_OPTIMIZABLE_INLINE_P (callee->decl)
>            && !opt_for_fn (e->caller->decl, flag_inline_small_functions))
>      {
>        e->inline_failed = CIF_FUNCTION_NOT_INLINE_CANDIDATE;
> @@ -968,7 +968,7 @@ want_inline_small_function_p (struct cgr
>      want_inline = false;
>    else if (DECL_DISREGARD_INLINE_LIMITS (callee->decl))
>      ;
> -  else if (!DECL_DECLARED_INLINE_P (callee->decl)
> +  else if (!DECL_OPTIMIZABLE_INLINE_P (callee->decl)
>            && !opt_for_fn (e->caller->decl, flag_inline_small_functions))
>      {
>        e->inline_failed = CIF_FUNCTION_NOT_INLINE_CANDIDATE;
> @@ -976,7 +976,7 @@ want_inline_small_function_p (struct cgr
>      }
>    /* Do fast and conservative check if the function can be good
>       inline candidate.  */
> -  else if ((!DECL_DECLARED_INLINE_P (callee->decl)
> +  else if ((!DECL_OPTIMIZABLE_INLINE_P (callee->decl)
>            && (!e->count.ipa ().initialized_p () || !e->maybe_hot_p ()))
>            && ipa_fn_summaries->get (callee)->min_size
>                 - ipa_call_summaries->get (e)->call_stmt_size
> @@ -985,13 +985,13 @@ want_inline_small_function_p (struct cgr
>        e->inline_failed = CIF_MAX_INLINE_INSNS_AUTO_LIMIT;
>        want_inline = false;
>      }
> -  else if ((DECL_DECLARED_INLINE_P (callee->decl)
> +  else if ((DECL_OPTIMIZABLE_INLINE_P (callee->decl)
>             || e->count.ipa ().nonzero_p ())
>            && ipa_fn_summaries->get (callee)->min_size
>                 - ipa_call_summaries->get (e)->call_stmt_size
>               > inline_insns_single (e->caller, true, true))
>      {
> -      e->inline_failed = (DECL_DECLARED_INLINE_P (callee->decl)
> +      e->inline_failed = (DECL_OPTIMIZABLE_INLINE_P (callee->decl)
>                           ? CIF_MAX_INLINE_INSNS_SINGLE_LIMIT
>                           : CIF_MAX_INLINE_INSNS_AUTO_LIMIT);
>        want_inline = false;
> @@ -1016,7 +1016,7 @@ want_inline_small_function_p (struct cgr
>          hints suggests that inlining given function is very profitable.
>          Avoid computation of big_speedup_p when not necessary to change
>          outcome of decision.  */
> -      else if (DECL_DECLARED_INLINE_P (callee->decl)
> +      else if (DECL_OPTIMIZABLE_INLINE_P (callee->decl)
>                && growth >= inline_insns_single (e->caller, apply_hints,
>                                                  apply_hints2)
>                && (apply_hints || apply_hints2
> @@ -1027,7 +1027,7 @@ want_inline_small_function_p (struct cgr
>            e->inline_failed = CIF_MAX_INLINE_INSNS_SINGLE_LIMIT;
>           want_inline = false;
>         }
> -      else if (!DECL_DECLARED_INLINE_P (callee->decl)
> +      else if (!DECL_OPTIMIZABLE_INLINE_P (callee->decl)
>                && !opt_for_fn (e->caller->decl, flag_inline_functions)
>                && growth >= opt_for_fn (to->decl,
>                                         param_max_inline_insns_small))
> @@ -1042,7 +1042,7 @@ want_inline_small_function_p (struct cgr
>         }
>        /* Apply param_max_inline_insns_auto limit for functions not declared
>          inline.  Bypass the limit when speedup seems big.  */
> -      else if (!DECL_DECLARED_INLINE_P (callee->decl)
> +      else if (!DECL_OPTIMIZABLE_INLINE_P (callee->decl)
>                && growth >= inline_insns_auto (e->caller, apply_hints,
>                                                apply_hints2)
>                && (apply_hints || apply_hints2
> @@ -1095,7 +1095,7 @@ want_inline_self_recursive_call_p (struc
>    int max_depth = opt_for_fn (outer_node->decl,
>                               param_max_inline_recursive_depth_auto);
>
> -  if (DECL_DECLARED_INLINE_P (edge->caller->decl))
> +  if (DECL_OPTIMIZABLE_INLINE_P (edge->caller->decl))
>      max_depth = opt_for_fn (outer_node->decl,
>                             param_max_inline_recursive_depth);
>
> @@ -1258,7 +1258,7 @@ want_inline_function_to_all_callers_p (s
>  static bool
>  wrapper_heuristics_may_apply (struct cgraph_node *where, int size)
>  {
> -  return size < (DECL_DECLARED_INLINE_P (where->decl)
> +  return size < (DECL_OPTIMIZABLE_INLINE_P (where->decl)
>                  ? inline_insns_single (where, false, false)
>                  : inline_insns_auto (where, false, false));
>  }
> @@ -1376,8 +1376,8 @@ edge_badness (struct cgraph_edge *edge,
>           /* ... and edges executed only conditionally ... */
>           && freq < 1
>           /* ... consider case where callee is not inline but caller is ... */
> -         && ((!DECL_DECLARED_INLINE_P (edge->callee->decl)
> -              && DECL_DECLARED_INLINE_P (caller->decl))
> +         && ((!DECL_OPTIMIZABLE_INLINE_P (edge->callee->decl)
> +              && DECL_OPTIMIZABLE_INLINE_P (caller->decl))
>               /* ... or when early optimizers decided to split and edge
>                  frequency still indicates splitting is a win ... */
>               || (callee->split_part && !caller->split_part
> @@ -1385,8 +1385,8 @@ edge_badness (struct cgraph_edge *edge,
>                          < opt_for_fn (caller->decl,
>                                        param_partial_inlining_entry_probability)
>                   /* ... and do not overwrite user specified hints.   */
> -                 && (!DECL_DECLARED_INLINE_P (edge->callee->decl)
> -                     || DECL_DECLARED_INLINE_P (caller->decl)))))
> +                 && (!DECL_OPTIMIZABLE_INLINE_P (edge->callee->decl)
> +                     || DECL_OPTIMIZABLE_INLINE_P (caller->decl)))))
>         {
>           ipa_fn_summary *caller_info = ipa_fn_summaries->get (caller);
>           int caller_growth = caller_info->growth;
> @@ -1767,7 +1767,7 @@ recursive_inlining (struct cgraph_edge *
>    if (node->inlined_to)
>      node = node->inlined_to;
>
> -  if (DECL_DECLARED_INLINE_P (node->decl))
> +  if (DECL_OPTIMIZABLE_INLINE_P (node->decl))
>      limit = opt_for_fn (to->decl, param_max_inline_insns_recursive);
>
>    /* Make sure that function is small enough to be considered for inlining.  */
> @@ -3042,7 +3042,7 @@ early_inline_small_functions (struct cgr
>         continue;
>
>        /* Do not consider functions not declared inline.  */
> -      if (!DECL_DECLARED_INLINE_P (callee->decl)
> +      if (!DECL_OPTIMIZABLE_INLINE_P (callee->decl)
>           && !opt_for_fn (node->decl, flag_inline_small_functions)
>           && !opt_for_fn (node->decl, flag_inline_functions))
>         continue;
> --- gcc/ipa-split.cc.jj 2024-10-25 10:00:29.481767671 +0200
> +++ gcc/ipa-split.cc    2024-11-14 14:06:38.188301534 +0100
> @@ -566,7 +566,7 @@ consider_split (class split_point *curre
>       inline predicates to reduce function body size.  We add 10 to anticipate
>       that.  Next stage1 we should try to be more meaningful here.  */
>    if (current->header_size + call_overhead
> -      >= (unsigned int)(DECL_DECLARED_INLINE_P (current_function_decl)
> +      >= (unsigned int)(DECL_OPTIMIZABLE_INLINE_P (current_function_decl)
>                         ? param_max_inline_insns_single
>                         : param_max_inline_insns_auto) + 10)
>      {
> @@ -1794,7 +1794,7 @@ execute_split_functions (void)
>
>    /* FIXME: We can actually split if splitting reduces call overhead.  */
>    if (!flag_inline_small_functions
> -      && !DECL_DECLARED_INLINE_P (current_function_decl))
> +      && !DECL_OPTIMIZABLE_INLINE_P (current_function_decl))
>      {
>        if (dump_file)
>         fprintf (dump_file, "Not splitting: not autoinlining and function"
> --- gcc/lto-streamer-out.cc.jj  2024-11-06 18:53:10.834843821 +0100
> +++ gcc/lto-streamer-out.cc     2024-11-14 13:54:40.896487395 +0100
> @@ -1348,6 +1348,8 @@ hash_tree (struct streamer_tree_cache_d
>        hstate.add_flag (DECL_DISREGARD_INLINE_LIMITS (t));
>        hstate.add_flag (DECL_PURE_P (t));
>        hstate.add_flag (DECL_LOOPING_CONST_OR_PURE_P (t));
> +      hstate.add_flag (DECL_FEEBLE_INLINE_P (t));
> +      hstate.add_flag (DECL_IS_REPLACEABLE_OPERATOR (t));
>        hstate.commit_flag ();
>        if (DECL_BUILT_IN_CLASS (t) != NOT_BUILT_IN)
>         hstate.add_int (DECL_UNCHECKED_FUNCTION_CODE (t));
> --- gcc/tree-streamer-in.cc.jj  2024-10-24 18:53:41.747037959 +0200
> +++ gcc/tree-streamer-in.cc     2024-11-14 13:53:36.818397336 +0100
> @@ -338,6 +338,7 @@ unpack_ts_function_decl_value_fields (st
>    DECL_DISREGARD_INLINE_LIMITS (expr) = (unsigned) bp_unpack_value (bp, 1);
>    DECL_PURE_P (expr) = (unsigned) bp_unpack_value (bp, 1);
>    DECL_LOOPING_CONST_OR_PURE_P (expr) = (unsigned) bp_unpack_value (bp, 1);
> +  DECL_FEEBLE_INLINE_P (expr) = (unsigned) bp_unpack_value (bp, 1);
>    DECL_IS_REPLACEABLE_OPERATOR (expr) = (unsigned) bp_unpack_value (bp, 1);
>    unsigned int fcode = 0;
>    if (cl != NOT_BUILT_IN)
> --- gcc/tree-streamer-out.cc.jj 2024-10-24 18:53:41.748037944 +0200
> +++ gcc/tree-streamer-out.cc    2024-11-14 13:53:53.996153404 +0100
> @@ -317,6 +317,7 @@ pack_ts_function_decl_value_fields (stru
>    bp_pack_value (bp, DECL_DISREGARD_INLINE_LIMITS (expr), 1);
>    bp_pack_value (bp, DECL_PURE_P (expr), 1);
>    bp_pack_value (bp, DECL_LOOPING_CONST_OR_PURE_P (expr), 1);
> +  bp_pack_value (bp, DECL_FEEBLE_INLINE_P (expr), 1);
>    bp_pack_value (bp, DECL_IS_REPLACEABLE_OPERATOR (expr), 1);
>    if (DECL_BUILT_IN_CLASS (expr) != NOT_BUILT_IN)
>      bp_pack_value (bp, DECL_UNCHECKED_FUNCTION_CODE (expr), 32);
> --- gcc/doc/invoke.texi.jj      2024-11-14 12:26:21.329899779 +0100
> +++ gcc/doc/invoke.texi 2024-11-14 15:29:52.048685977 +0100
> @@ -10445,7 +10445,8 @@ Warn if an @code{extern} declaration is
>  @opindex Winline
>  @opindex Wno-inline
>  @item -Winline
> -Warn if a function that is declared as inline cannot be inlined.
> +Warn if a function that is declared as inline cannot be inlined, unless
> +it is declared with the @code{feeble_inline} function attribute.
>  Even with this option, the compiler does not warn about failures to
>  inline functions declared in system headers.
>
> --- gcc/doc/extend.texi.jj      2024-11-14 12:25:49.769328639 +0100
> +++ gcc/doc/extend.texi 2024-11-14 14:51:44.263046285 +0100
> @@ -2875,6 +2875,30 @@ Note that if such a function is called i
>  or may not inline it depending on optimization level and a failure
>  to inline an indirect call may or may not be diagnosed.
>
> +@cindex @code{feeble_inline} function attribute
> +@item feeble_inline
> +GCC generally considers functions declared inline (with @code{inline}
> +keyword, or in C++ with @code{constexpr} keyword, or member functions
> +defined in class definition) as more desirable to inline over similar
> +functions not declared inline.  For most such functions this heuristic
> +results in better code generation.
> +The @code{feeble_inline} function attribute says that a particular
> +function declared inline should not be treated differently for optimization
> +purposes from similar functions not declared inline.
> +This is useful if a function is declared inline for other properties
> +of inline functions than optimizations and in particular the actual
> +inlining.  E.g.@: a very large function can be declared @code{constexpr}
> +in C++ just so that it can be evaluated in constant expressions, but it
> +is too large or handles less important corner cases to be worth inlining.
> +Or such function is declared @code{inline} to get the C++ comdat behavior
> +of inline functions.  Or it is an @code{extern} @code{gnu_inline} inline
> +function, which can use an external definition if it is not inlined, and has
> +a body which can be inlined if the compiler deems inlining worthwhile, but
> +the compiler shouldn't try hard to inline it.
> +The behavior of functions with this attribute is somewhere between
> +@code{always_inline}, which are inlined always and @code{noinline}, which
> +are never inlined.  @code{feeble_inline} can be inlined if it is worth it.
> +
>  @cindex @code{artificial} function attribute
>  @item artificial
>  This attribute is useful for small inline wrappers that if possible
> --- gcc/c-family/c-attribs.cc.jj        2024-11-14 12:25:49.238335855 +0100
> +++ gcc/c-family/c-attribs.cc   2024-11-14 16:26:16.908779266 +0100
> @@ -82,6 +82,7 @@ static tree handle_leaf_attribute (tree
>  static tree handle_always_inline_attribute (tree *, tree, tree, int,
>                                             bool *);
>  static tree handle_gnu_inline_attribute (tree *, tree, tree, int, bool *);
> +static tree handle_feeble_inline_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_artificial_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_flatten_attribute (tree *, tree, tree, int, bool *);
>  static tree handle_error_attribute (tree *, tree, tree, int, bool *);
> @@ -225,6 +226,7 @@ static const struct attribute_spec::excl
>  static const struct attribute_spec::exclusions attr_always_inline_exclusions[] =
>  {
>    ATTR_EXCL ("noinline", true, true, true),
> +  ATTR_EXCL ("feeble_inline", true, true, true),
>    ATTR_EXCL ("target_clones", true, true, true),
>    ATTR_EXCL (NULL, false, false, false),
>  };
> @@ -233,6 +235,14 @@ static const struct attribute_spec::excl
>  {
>    ATTR_EXCL ("always_inline", true, true, true),
>    ATTR_EXCL ("gnu_inline", true, true, true),
> +  ATTR_EXCL ("feeble_inline", true, true, true),
> +  ATTR_EXCL (NULL, false, false, false),
> +};
> +
> +static const struct attribute_spec::exclusions attr_feeble_inline_exclusions[] =
> +{
> +  ATTR_EXCL ("always_inline", true, true, true),
> +  ATTR_EXCL ("noinline", true, true, true),
>    ATTR_EXCL (NULL, false, false, false),
>  };
>
> @@ -379,6 +389,9 @@ const struct attribute_spec c_common_gnu
>    { "gnu_inline",             0, 0, true,  false, false, false,
>                               handle_gnu_inline_attribute,
>                               attr_inline_exclusions },
> +  { "feeble_inline",          0, 0, true,  false, false, false,
> +                             handle_feeble_inline_attribute,
> +                             attr_feeble_inline_exclusions },
>    { "artificial",             0, 0, true,  false, false, false,
>                               handle_artificial_attribute, NULL },
>    { "flatten",                0, 0, true,  false, false, false,
> @@ -1791,6 +1804,26 @@ handle_gnu_inline_attribute (tree *node,
>    else
>      {
>        warning (OPT_Wattributes, "%qE attribute ignored", name);
> +      *no_add_attrs = true;
> +    }
> +
> +  return NULL_TREE;
> +}
> +
> +/* Handle a "feeble_inline" attribute; arguments as in
> +   struct attribute_spec.handler.  */
> +
> +static tree
> +handle_feeble_inline_attribute (tree *node, tree name,
> +                               tree ARG_UNUSED (args),
> +                               int ARG_UNUSED (flags),
> +                               bool *no_add_attrs)
> +{
> +  if (TREE_CODE (*node) == FUNCTION_DECL)
> +    DECL_FEEBLE_INLINE_P (*node) = 1;
> +  else
> +    {
> +      warning (OPT_Wattributes, "%qE attribute ignored", name);
>        *no_add_attrs = true;
>      }
>
> --- gcc/c/c-decl.cc.jj  2024-11-14 12:25:49.614330745 +0100
> +++ gcc/c/c-decl.cc     2024-11-14 13:19:37.647235028 +0100
> @@ -2992,6 +2992,10 @@ merge_decls (tree newdecl, tree olddecl,
>             = DECL_DISREGARD_INLINE_LIMITS (olddecl)
>             = (DECL_DISREGARD_INLINE_LIMITS (newdecl)
>                || DECL_DISREGARD_INLINE_LIMITS (olddecl));
> +
> +         DECL_FEEBLE_INLINE_P (newdecl) = DECL_FEEBLE_INLINE_P (olddecl)
> +           = (DECL_FEEBLE_INLINE_P (newdecl)
> +              || DECL_FEEBLE_INLINE_P (olddecl));
>         }
>
>        if (fndecl_built_in_p (olddecl))
> --- gcc/cp/decl.cc.jj   2024-11-05 08:58:25.147845688 +0100
> +++ gcc/cp/decl.cc      2024-11-14 13:40:46.518305364 +0100
> @@ -2515,6 +2515,8 @@ duplicate_decls (tree newdecl, tree oldd
>                 = DECL_DECLARED_INLINE_P (new_result);
>               DECL_DISREGARD_INLINE_LIMITS (old_result)
>                 |= DECL_DISREGARD_INLINE_LIMITS (new_result);
> +             DECL_FEEBLE_INLINE_P (old_result)
> +               |= DECL_FEEBLE_INLINE_P (new_result);
>             }
>           else
>             {
> @@ -2522,6 +2524,8 @@ duplicate_decls (tree newdecl, tree oldd
>                 |= DECL_DECLARED_INLINE_P (new_result);
>               DECL_DISREGARD_INLINE_LIMITS (old_result)
>                 |= DECL_DISREGARD_INLINE_LIMITS (new_result);
> +             DECL_FEEBLE_INLINE_P (old_result)
> +               |= DECL_FEEBLE_INLINE_P (new_result);
>               check_redeclaration_exception_specification (newdecl, olddecl);
>
>               merge_attribute_bits (new_result, old_result);
> @@ -2956,6 +2960,9 @@ duplicate_decls (tree newdecl, tree oldd
>           DECL_DISREGARD_INLINE_LIMITS (olddecl)
>             = DECL_DISREGARD_INLINE_LIMITS (newdecl);
>
> +         DECL_FEEBLE_INLINE_P (olddecl)
> +           = DECL_FEEBLE_INLINE_P (newdecl);
> +
>           DECL_UNINLINABLE (olddecl) = DECL_UNINLINABLE (newdecl);
>         }
>        else if (new_defines_function && DECL_INITIAL (olddecl))
> @@ -2988,6 +2995,10 @@ duplicate_decls (tree newdecl, tree oldd
>             = DECL_DISREGARD_INLINE_LIMITS (olddecl)
>             = (DECL_DISREGARD_INLINE_LIMITS (newdecl)
>                || DECL_DISREGARD_INLINE_LIMITS (olddecl));
> +         DECL_FEEBLE_INLINE_P (newdecl)
> +           = DECL_FEEBLE_INLINE_P (olddecl)
> +           = (DECL_FEEBLE_INLINE_P (newdecl)
> +              || DECL_FEEBLE_INLINE_P (olddecl));
>         }
>
>        /* Preserve abstractness on cloned [cd]tors.  */
> --- gcc/cp/method.cc.jj 2024-10-24 18:53:38.456084055 +0200
> +++ gcc/cp/method.cc    2024-11-14 13:41:17.877861551 +0100
> @@ -3455,6 +3455,8 @@ implicitly_declare_fn (special_function_
>        DECL_ATTRIBUTES (fn) = clone_attrs (DECL_ATTRIBUTES (inherited_ctor_fn));
>        DECL_DISREGARD_INLINE_LIMITS (fn)
>         = DECL_DISREGARD_INLINE_LIMITS (inherited_ctor_fn);
> +      DECL_FEEBLE_INLINE_P (fn)
> +       = DECL_FEEBLE_INLINE_P (inherited_ctor_fn);
>      }
>
>    /* Add the "this" parameter.  */
> --- gcc/cp/optimize.cc.jj       2024-10-25 10:00:29.420768544 +0200
> +++ gcc/cp/optimize.cc  2024-11-14 13:50:51.303746232 +0100
> @@ -534,6 +534,7 @@ maybe_clone_body (tree fn)
>        DECL_DLLIMPORT_P (clone) = DECL_DLLIMPORT_P (fn);
>        DECL_ATTRIBUTES (clone) = clone_attrs (DECL_ATTRIBUTES (fn));
>        DECL_DISREGARD_INLINE_LIMITS (clone) = DECL_DISREGARD_INLINE_LIMITS (fn);
> +      DECL_FEEBLE_INLINE_P (clone) = DECL_FEEBLE_INLINE_P (fn);
>        set_decl_section_name (clone, fn);
>
>        /* Adjust the parameter names and locations.  */
> --- gcc/testsuite/c-c++-common/attr-feeble_inline-1.c.jj        2024-11-14 16:28:39.522761025 +0100
> +++ gcc/testsuite/c-c++-common/attr-feeble_inline-1.c   2024-11-14 16:35:55.125596393 +0100
> @@ -0,0 +1,10 @@
> +/* PR c++/93008 */
> +/* { dg-do compile } */
> +
> +__attribute__((feeble_inline)) int f1 (void) { return 0; }	/* { dg-warning "'feeble_inline' attribute ignored on a function not declared 'inline'" } */
> +__attribute__((feeble_inline (0))) static inline int f2 (void) { return 0; }	/* { dg-error "wrong number of arguments specified for 'feeble_inline' attribute" } */
> +__attribute__((feeble_inline, always_inline)) static inline int f3 (void) { return 0; }	/* { dg-warning "ignoring attribute 'always_inline' because it conflicts with attribute 'feeble_inline'" } */
> +__attribute__((always_inline, feeble_inline)) static inline int f4 (void) { return 0; }	/* { dg-warning "ignoring attribute 'feeble_inline' because it conflicts with attribute 'always_inline'" } */
> +__attribute__((feeble_inline, noinline)) static int f5 (void) { return 0; }	/* { dg-warning "ignoring attribute 'noinline' because it conflicts with attribute 'feeble_inline'" } */
> +	/* { dg-warning "'feeble_inline' attribute ignored on a function not declared 'inline'" "" { target *-*-* } .-1 } */
> +__attribute__((noinline, feeble_inline)) static int f6 (void) { return 0; }	/* { dg-warning "ignoring attribute 'feeble_inline' because it conflicts with attribute 'noinline'" } */
> --- gcc/testsuite/gcc.dg/attr-feeble_inline-1.c.jj	2024-11-14 15:28:38.580725154 +0100
> +++ gcc/testsuite/gcc.dg/attr-feeble_inline-1.c	2024-11-14 15:36:25.275123964 +0100
> @@ -0,0 +1,50 @@
> +/* PR c++/93008 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-tree-vectorize -fdump-tree-optimized -Winline" } */
> +/* { dg-final { scan-tree-dump-times " = foo \\\(" 3 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times " = bar \\\(" 3 "optimized" } } */
> +/* { dg-final { scan-tree-dump-not " = corge \\\(" "optimized" } } */
> +
> +static inline __attribute__((feeble_inline)) int
> +foo (int *x)
> +{
> +  int r = 0;
> +#define A(n) r += x[n];
> +#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
> +#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
> +#define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
> +  B(1) B(2) B(3) B(4)
> +  return r;
> +}
> +
> +static inline __attribute__((feeble_inline)) int
> +bar (int *x)
> +{
> +  int r = 0;
> +  C(1) C(2) C(3) C(4)
> +  return r;
> +}
> +
> +int
> +baz (int *x, int *y, int *z)
> +{
> +  return foo (x) + foo (y) + foo (z);
> +}
> +
> +int
> +qux (int *x, int *y, int *z)
> +{
> +  return bar (x) + bar (y) + bar (z);
> +}
> +
> +static inline __attribute__((feeble_inline)) int
> +corge (int x)
> +{
> +  return x;
> +}
> +
> +int
> +freddy (void)
> +{
> +  return corge (0);
> +}
> --- gcc/testsuite/g++.dg/ext/attr-feeble_inline-1.C.jj	2024-11-14 15:34:27.435790745 +0100
> +++ gcc/testsuite/g++.dg/ext/attr-feeble_inline-1.C	2024-11-14 15:41:52.227499393 +0100
> @@ -0,0 +1,71 @@
> +// PR c++/93008
> +// { dg-do compile { target c++14 } }
> +// { dg-options "-O2 -fno-tree-vectorize -fdump-tree-optimized -Winline" }
> +// { dg-final { scan-tree-dump-times " = foo \\\(" 3 "optimized" } }
> +// { dg-final { scan-tree-dump-times " = bar \\\(" 3 "optimized" } }
> +// { dg-final { scan-tree-dump-not " = corge \\\(" "optimized" } }
> +
> +[[gnu::feeble_inline]] constexpr int
> +foo (int *x)
> +{
> +  int r = 0;
> +#define A(n) r += x[n];
> +#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
> +#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
> +#define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
> +  B(1) B(2) B(3) B(4)
> +  return r;
> +}
> +
> +[[gnu::feeble_inline]] constexpr int
> +bar (int *x)
> +{
> +  int r = 0;
> +  C(1) C(2) C(3) C(4)
> +  return r;
> +}
> +
> +int
> +baz (int *x, int *y, int *z)
> +{
> +  return foo (x) + foo (y) + foo (z);
> +}
> +
> +int
> +qux (int *x, int *y, int *z)
> +{
> +  return bar (x) + bar (y) + bar (z);
> +}
> +
> +[[gnu::feeble_inline]] constexpr int
> +corge (int x)
> +{
> +  return x;
> +}
> +
> +int
> +freddy (void)
> +{
> +  return corge (0);
> +}
> +
> +constexpr int
> +garply ()
> +{
> +  int a[140] = {};
> +  a[10] = 42;
> +  return foo (a);
> +}
> +
> +constexpr int
> +waldo ()
> +{
> +  int a[1400] = {};
> +  a[100] = -42;
> +  return bar (a);
> +}
> +
> +static_assert (garply () == 42, "");
> +static_assert (waldo () == -42, "");
> +static_assert (corge (0) == 0, "");
> +static_assert (corge (42) == 42, "");
> --- gcc/testsuite/g++.dg/ext/attr-feeble_inline-2.C.jj	2024-11-14 15:42:59.043554319 +0100
> +++ gcc/testsuite/g++.dg/ext/attr-feeble_inline-2.C	2024-11-14 15:49:25.675085636 +0100
> @@ -0,0 +1,27 @@
> +// PR c++/93008
> +// { dg-do run { target c++11 } }
> +// { dg-additional-sources "attr-feeble_inline-2.cc" }
> +
> +[[gnu::feeble_inline]] inline int &
> +foo ()
> +{
> +  static int a;
> +  return a;
> +}
> +
> +struct S {
> +  [[gnu::feeble_inline]] int &bar () { static int a; return a; }
> +};
> +
> +extern void bar (int *&, int *&);
> +
> +int
> +main ()
> +{
> +  int &a = foo ();
> +  int &b = S{}.bar ();
> +  int *c, *d;
> +  bar (c, d);
> +  if (&a != c || &a == &b || &b != d)
> +    __builtin_abort ();
> +}
> --- gcc/testsuite/g++.dg/ext/attr-feeble_inline-2.cc.jj	2024-11-14 15:56:10.945347610 +0100
> +++ gcc/testsuite/g++.dg/ext/attr-feeble_inline-2.cc	2024-11-14 15:57:27.855257932 +0100
> @@ -0,0 +1,19 @@
> +// PR c++/93008
> +
> +[[gnu::feeble_inline]] inline int &
> +foo ()
> +{
> +  static int a;
> +  return a;
> +}
> +
> +struct S {
> +  [[gnu::feeble_inline]] int &bar () { static int a; return a; }
> +};
> +
> +void
> +bar (int *&x, int *&y)
> +{
> +  x = &foo ();
> +  y = &S{}.bar ();
> +}
>
>         Jakub
>
