Hi Martin,

on 2021/9/16 下午9:19, Martin Jambor wrote:
> Hi,
> 
> On Thu, Sep 16 2021, Kewen.Lin wrote:
>> Hi Martin,
>>
>> Thanks for the review comments!
>>
>> on 2021/9/15 下午8:51, Martin Jambor wrote:
>>> Hi,
>>>
>>> since this is inlining-related, I would somewhat prefer Honza to have a
>>> look too, but I have the following comments:
>>>
>>> On Wed, Sep 08 2021, Kewen.Lin wrote:
>>>>
>>>
>>> [...]
>>>
>>>> diff --git a/gcc/ipa-fnsummary.h b/gcc/ipa-fnsummary.h
>>>> index 78399b0b9bb..300b8da4507 100644
>>>> --- a/gcc/ipa-fnsummary.h
>>>> +++ b/gcc/ipa-fnsummary.h
>>>> @@ -193,6 +194,9 @@ public:
>>>>    vec<ipa_freqcounting_predicate, va_gc> *loop_strides;
>>>>    /* Parameters tested by builtin_constant_p.  */
>>>>    vec<int, va_heap, vl_ptr> GTY((skip)) builtin_constant_p_parms;
>>>> +  /* Like fp_expressions, but it's to hold some target specific 
>>>> information,
>>>> +     such as some target specific isa flags.  */
>>>> +  auto_vec<HOST_WIDE_INT> GTY((skip)) target_info;
>>>>    /* Estimated growth for inlining all copies of the function before start
>>>>       of small functions inlining.
>>>>       This value will get out of date as the callers are duplicated, but
>>>
>>> Segher already wrote in the first thread that a vector of HOST_WIDE_INTs
>>> is an overkill and I agree.  So at least make the new field just a
>>> HOST_WIDE_INT or better yet, an unsigned int.  But I would even go
>>> further and make target_info only a 16-bit bit-field, place it after the
>>> other bit-fields in class ipa_fn_summary and pass it to the hooks as
>>> uint16_t.  Unless you have plans which require more space, I think we
>>> should be conservative here.
>>>
>>
>> OK, yeah, the consideration is mainly for the scenario that target has
>> a few bits to care about.  I just realized that to avoid inefficient
>> bitwise operation for mapping target info bits to isa_flag bits, target
>> can rearrange the sparse bits in isa_flag, so it's not a deal.
>> Thanks for re-raising this!  I'll use the 16 bits bit-field in v3 as you
>> suggested, if you don't mind, I will put it before the existing bit-fields
>> to have a good alignment.
> 
> All right.
> 

Sorry that I failed to use 16 bit-fields for this, I figured out that
the bit-fields can not be address-taken or passed as non-const reference.
The gentype also failed to recognize uint16_t if I used uint16_t directly
in ipa-fnsummary.h.  Finally I used unsigned int instead.


>>>>    /* When optimizing and analyzing for IPA inliner, initialize loop 
>>>> optimizer
>>>>       so we can produce proper inline hints.
>>>> @@ -2659,6 +2669,12 @@ analyze_function_body (struct cgraph_node *node, 
>>>> bool early)
>>>>                       bb_predicate,
>>>>                       bb_predicate);
>>>>  
>>>> +  /* Only look for target information for inlinable functions.  */
>>>> +  bool scan_for_target_info =
>>>> +    info->inlinable
>>>> +    && targetm.target_option.need_ipa_fn_target_info (node->decl,
>>>> +                                                info->target_info);
>>>> +
>>>>    if (fbi.info)
>>>>      compute_bb_predicates (&fbi, node, info, params_summary);
>>>>    const profile_count entry_count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
>>>> @@ -2876,6 +2892,10 @@ analyze_function_body (struct cgraph_node *node, 
>>>> bool early)
>>>>              if (dump_file)
>>>>                fprintf (dump_file, "   fp_expression set\n");
>>>>            }
>>>> +        if (scan_for_target_info)
>>>> +          scan_for_target_info =
>>>> +            targetm.target_option.update_ipa_fn_target_info
>>>> +            (info->target_info, stmt);
>>>>        }
>>>
>>> Practically it probably does not matter, but why is this in the "if
>>> (this_time || this_size)" block?  Although I can see that setting
>>> fp_expression is also done that way... but it seems like copying a
>>> mistake to me.
>>
>> Yeah, I felt target info scanning is similar to fp_expression scanning,
>> so I just followed the same way.  If I read it right, the case
>> !(this_time || this_size) means the STMT won't be weighted to any RTL
>> insn from both time and size perspectives, so guarding it seems to avoid
>> unnecessary scannings.  I assumed that target bifs and inline asm would
>> not be evaluated as zero cost, it seems safe so far for HTM usage.
>>
>> Do you worry about some special STMT which is weighted to zero but it's
>> necessarily to be checked for target info in a long term?
>> If so, I'll move it out in v3.
> 
> It seems that gimple_call_internal_p statements are always costed to
> zero and I am wondering whether those are something that targets would
> want to look out for in the future.
> 
> But hopefully anyone implementing that in the future would come up with
> a testcase and would need to fix this to have the testcase pass.
> 

Thanks for confirming, I guess targets are very likely to have the need
to scan the IFNs in future.  I've moved it out of the block in V3.
Thanks for noticing this!

V3: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579658.html

BR,
Kewen

Reply via email to