> Am 26.10.2023 um 16:58 schrieb Qing Zhao <qing.z...@oracle.com>:
>
>
>
>> On Oct 26, 2023, at 4:56 AM, Richard Biener <richard.guent...@gmail.com>
>> wrote:
>>
>>> On Thu, Oct 26, 2023 at 7:22 AM Jakub Jelinek <ja...@redhat.com> wrote:
>>>
>>> On Wed, Oct 25, 2023 at 07:03:43PM +0000, Qing Zhao wrote:
>>>> For the code generation impact:
>>>>
>>>> turning the original x.buf
>>>> to a builtin function call
>>>> __builtin_with_access_and_size(x,buf, x.L,-1)
>>>>
>>>> might inhibit some optimizations from happening before the builtin is
>>>> evaluated into object size info (phase .objsz1). I guess there might be
>>>> some performance impact.
>>>>
>>>> However, if we mark this builtin as PURE, NOTRROW, etc, then the negative
>>>> performance impact will be reduced to minimum?
>>>
>>> You can't drop it during objsz1 pass though, otherwise __bdos wouldn't
>>> be able to figure out the dynamic sizes in case of normal (non-early)
>>> inlining - caller takes address of a counted_by array, passes it down
>>> to callee which is only inlined late and uses __bdos, or callee takes
>>> address
>>> and returns it and caller uses __bdos, etc. - so it would need to be objsz2.
>>>
>>> And while the builtin (or if it is an internal detail rather than user
>>> accessible builtin an internal function) could be even const/nothrow/leaf if
>>> the arguments contain the loads from the structure 2 fields, I'm afraid it
>>> will still have huge code generation impact, prevent tons of pre-IPA
>>> optimizations. And it will need some work to handle it properly during
>>> inlining heuristics, because in GIMPLE the COMPONENT_REF loads aren't gimple
>>> values, so it wouldn't be just the builtin/internal-fn call to be ignored,
>>> but also the count load from memory.
>>
>> I think we want to track the value, not the "memory" in the builtin call,
>> so GIMPLE would be
>>
>> _1 = x.L;
>> .. = __builtin_with_access_and_size (&x.buf, _1, -1);
>
> Before adding the __builtin_with_access_and_size, the code is:
>
> &x.buf
>
> After inserting the built-in, it becomes:
>
> _1 = x.L;
> __builtin_with_access_and_size (&x.buf, _1, -1).
>
>
> So, the # of total instructions, the # of LOADs, and the # of calls will all
> be increased.
> There will be impact to the inlining decision definitely.
Note we have to make sure, if x is a pointer and we want to instrument &x->buf
that we
Can dereference x. Possibly doing
_1 = x ? x->Len : -1;
I’m not sure the C standard makes accessing x->Len unconditionally not
undefined behavior when &x->buf is computed. Definitely it’s a violation of
the abstract machine of Len is volatile qualified (but we can reject such
counted_by or instantiations as volatile qualified types).
Richard
>
>>
>> also please make sure to use an internal function for
>> __builtin_with_access_and_size,
>> I don't think we want to expose this to users - it's an implementation
>> detail.
>
> Okay, will define it as an internal function (add it to internal-fn.def). -:)
>
> Qing
>>
>> Richard.
>>
>>>
>>> Jakub
>>>
>