Hi Martin, on 2021/9/16 下午9:19, Martin Jambor wrote: > Hi, > > On Thu, Sep 16 2021, Kewen.Lin wrote: >> Hi Martin, >> >> Thanks for the review comments! >> >> on 2021/9/15 下午8:51, Martin Jambor wrote: >>> Hi, >>> >>> since this is inlining-related, I would somewhat prefer Honza to have a >>> look too, but I have the following comments: >>> >>> On Wed, Sep 08 2021, Kewen.Lin wrote: >>>> >>> >>> [...] >>> >>>> diff --git a/gcc/ipa-fnsummary.h b/gcc/ipa-fnsummary.h >>>> index 78399b0b9bb..300b8da4507 100644 >>>> --- a/gcc/ipa-fnsummary.h >>>> +++ b/gcc/ipa-fnsummary.h >>>> @@ -193,6 +194,9 @@ public: >>>> vec<ipa_freqcounting_predicate, va_gc> *loop_strides; >>>> /* Parameters tested by builtin_constant_p. */ >>>> vec<int, va_heap, vl_ptr> GTY((skip)) builtin_constant_p_parms; >>>> + /* Like fp_expressions, but it's to hold some target specific >>>> information, >>>> + such as some target specific isa flags. */ >>>> + auto_vec<HOST_WIDE_INT> GTY((skip)) target_info; >>>> /* Estimated growth for inlining all copies of the function before start >>>> of small functions inlining. >>>> This value will get out of date as the callers are duplicated, but >>> >>> Segher already wrote in the first thread that a vector of HOST_WIDE_INTs >>> is an overkill and I agree. So at least make the new field just a >>> HOST_WIDE_INT or better yet, an unsigned int. But I would even go >>> further and make target_info only a 16-bit bit-field, place it after the >>> other bit-fields in class ipa_fn_summary and pass it to the hooks as >>> uint16_t. Unless you have plans which require more space, I think we >>> should be conservative here. >>> >> >> OK, yeah, the consideration is mainly for the scenario that target has >> a few bits to care about. I just realized that to avoid inefficient >> bitwise operation for mapping target info bits to isa_flag bits, target >> can rearrange the sparse bits in isa_flag, so it's not a deal. >> Thanks for re-raising this! I'll use the 16 bits bit-field in v3 as you >> suggested, if you don't mind, I will put it before the existing bit-fields >> to have a good alignment. > > All right. >
Sorry that I failed to use 16 bit-fields for this, I figured out that the bit-fields can not be address-taken or passed as non-const reference. The gentype also failed to recognize uint16_t if I used uint16_t directly in ipa-fnsummary.h. Finally I used unsigned int instead. >>>> /* When optimizing and analyzing for IPA inliner, initialize loop >>>> optimizer >>>> so we can produce proper inline hints. >>>> @@ -2659,6 +2669,12 @@ analyze_function_body (struct cgraph_node *node, >>>> bool early) >>>> bb_predicate, >>>> bb_predicate); >>>> >>>> + /* Only look for target information for inlinable functions. */ >>>> + bool scan_for_target_info = >>>> + info->inlinable >>>> + && targetm.target_option.need_ipa_fn_target_info (node->decl, >>>> + info->target_info); >>>> + >>>> if (fbi.info) >>>> compute_bb_predicates (&fbi, node, info, params_summary); >>>> const profile_count entry_count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count; >>>> @@ -2876,6 +2892,10 @@ analyze_function_body (struct cgraph_node *node, >>>> bool early) >>>> if (dump_file) >>>> fprintf (dump_file, " fp_expression set\n"); >>>> } >>>> + if (scan_for_target_info) >>>> + scan_for_target_info = >>>> + targetm.target_option.update_ipa_fn_target_info >>>> + (info->target_info, stmt); >>>> } >>> >>> Practically it probably does not matter, but why is this in the "if >>> (this_time || this_size)" block? Although I can see that setting >>> fp_expression is also done that way... but it seems like copying a >>> mistake to me. >> >> Yeah, I felt target info scanning is similar to fp_expression scanning, >> so I just followed the same way. If I read it right, the case >> !(this_time || this_size) means the STMT won't be weighted to any RTL >> insn from both time and size perspectives, so guarding it seems to avoid >> unnecessary scannings. I assumed that target bifs and inline asm would >> not be evaluated as zero cost, it seems safe so far for HTM usage. >> >> Do you worry about some special STMT which is weighted to zero but it's >> necessarily to be checked for target info in a long term? >> If so, I'll move it out in v3. > > It seems that gimple_call_internal_p statements are always costed to > zero and I am wondering whether those are something that targets would > want to look out for in the future. > > But hopefully anyone implementing that in the future would come up with > a testcase and would need to fix this to have the testcase pass. > Thanks for confirming, I guess targets are very likely to have the need to scan the IFNs in future. I've moved it out of the block in V3. Thanks for noticing this! V3: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579658.html BR, Kewen