Hi, since this is inlining-related, I would somewhat prefer Honza to have a look too, but I have the following comments:
On Wed, Sep 08 2021, Kewen.Lin wrote: > [...] > diff --git a/gcc/ipa-fnsummary.h b/gcc/ipa-fnsummary.h > index 78399b0b9bb..300b8da4507 100644 > --- a/gcc/ipa-fnsummary.h > +++ b/gcc/ipa-fnsummary.h > @@ -193,6 +194,9 @@ public: > vec<ipa_freqcounting_predicate, va_gc> *loop_strides; > /* Parameters tested by builtin_constant_p. */ > vec<int, va_heap, vl_ptr> GTY((skip)) builtin_constant_p_parms; > + /* Like fp_expressions, but it's to hold some target specific information, > + such as some target specific isa flags. */ > + auto_vec<HOST_WIDE_INT> GTY((skip)) target_info; > /* Estimated growth for inlining all copies of the function before start > of small functions inlining. > This value will get out of date as the callers are duplicated, but Segher already wrote in the first thread that a vector of HOST_WIDE_INTs is an overkill and I agree. So at least make the new field just a HOST_WIDE_INT or better yet, an unsigned int. But I would even go further and make target_info only a 16-bit bit-field, place it after the other bit-fields in class ipa_fn_summary and pass it to the hooks as uint16_t. Unless you have plans which require more space, I think we should be conservative here. I am also not sure if I agree that the field should not be streamed for offloading, but since we do not have an offloading compiler needing them I guess for now that is OK. But it should be documented in the comment describing the field that it is not streamed to offloading compilers. [...] > diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c > index 2470937460f..72091b6193f 100644 > --- a/gcc/ipa-fnsummary.c > +++ b/gcc/ipa-fnsummary.c > @@ -2608,6 +2617,7 @@ analyze_function_body (struct cgraph_node *node, bool > early) > info->conds = NULL; > info->size_time_table.release (); > info->call_size_time_table.release (); > + info->target_info.release(); > > /* When optimizing and analyzing for IPA inliner, initialize loop optimizer > so we can produce proper inline hints. > @@ -2659,6 +2669,12 @@ analyze_function_body (struct cgraph_node *node, bool > early) > bb_predicate, > bb_predicate); > > + /* Only look for target information for inlinable functions. */ > + bool scan_for_target_info = > + info->inlinable > + && targetm.target_option.need_ipa_fn_target_info (node->decl, > + info->target_info); > + > if (fbi.info) > compute_bb_predicates (&fbi, node, info, params_summary); > const profile_count entry_count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count; > @@ -2876,6 +2892,10 @@ analyze_function_body (struct cgraph_node *node, bool > early) > if (dump_file) > fprintf (dump_file, " fp_expression set\n"); > } > + if (scan_for_target_info) > + scan_for_target_info = > + targetm.target_option.update_ipa_fn_target_info > + (info->target_info, stmt); > } Practically it probably does not matter, but why is this in the "if (this_time || this_size)" block? Although I can see that setting fp_expression is also done that way... but it seems like copying a mistake to me. All that said, the overall approach seems correct to me. Martin