> On Sep 11, 2020, at 12:18 PM, Segher Boessenkool <seg...@kernel.crashing.org>
> wrote:
>
> On Thu, Sep 10, 2020 at 05:50:40PM -0500, Qing Zhao wrote:
>>>>>> Shrink-wrapped stuff. Quite important for performance. Not something
>>>>>> you can throw away.
>
> ^^^ !!! ^^^
>
>>> Start looking at handle_simple_exit()? targetm.gen_simple_return()…
>>
>> Yes, I have been looking at this since this morning.
>> You are right, we also need to insert zeroing sequence before this
>> simple_return which the current patch missed.
>
> Please run the performance loss numbers again after you have something
> more realistic :-(
Yes, I will collect the performance data with the new patch.
>
>> I am currently trying to resolve this issue with the following idea:
>>
>> In the routine “thread_prologue_and_epilogue_insns”, after both
>> “make_epilogue_seq” and “try_shrink_wrapping” have finished,
>>
>> scan every exit block to see whether its last insn satisfies
>> ANY_RETURN_P (insn). If yes, generate the zeroing sequence before this
>> return insn.
>>
>> That should take care of every exit path that returns.
>>
>> Do you see any issue with this idea?
>
> You need to let the backend decide what to do, for this as well as for
> all other cases. I do not know how often I will have to repeat that.
Yes, the new patch will separate the whole task into two parts:
A. Compute the hard register set based on the user option, source code
   attribute, data flow information, and function ABI information.
   The result will be “need_zeroed_register_set”, which is then passed to
   the target hook.
B. Each target will have its own implementation of emitting the zeroing
   sequence based on the “need_zeroed_register_set”.
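
To make the split concrete, here is a rough, standalone toy sketch of the two
parts. It is not the patch and does not use GCC's real types: the 16-register
bitmask, the mode enum, the struct fields, and the names
“compute_need_zeroed_register_set” and “toy_target_zero_regs” are all
hypothetical, made up only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model: 16 hard registers, one bit per register.
   GCC's real HARD_REG_SET is not used here.  */
typedef uint32_t hard_reg_set;
#define NUM_HARD_REGS 16

/* Hypothetical stand-in for the user option / source code attribute.  */
enum zero_mode { ZERO_SKIP, ZERO_USED_GPR, ZERO_ALL_GPR };

/* Illustrative inputs for part A.  */
struct function_info
{
  enum zero_mode mode;          /* from the option or the attribute */
  hard_reg_set call_used;       /* from function ABI information */
  hard_reg_set gprs;            /* general purpose registers */
  hard_reg_set live_at_return;  /* e.g. return value registers */
  hard_reg_set ever_used;       /* from data flow information */
};

/* Part A (target independent): compute the set of registers to zero.  */
hard_reg_set
compute_need_zeroed_register_set (const struct function_info *fn)
{
  if (fn->mode == ZERO_SKIP)
    return 0;

  /* Never zero a register that is still live at the return.  */
  hard_reg_set set = fn->call_used & fn->gprs & ~fn->live_at_return;
  if (fn->mode == ZERO_USED_GPR)
    set &= fn->ever_used;
  return set;
}

/* Part B (per target): the hook decides how to zero the given set.
   This toy target only records which registers it would clear and
   returns how many zeroing insns it would emit.  */
typedef int (*zero_regs_hook) (hard_reg_set need_zeroed, int *emitted, int max);

int
toy_target_zero_regs (hard_reg_set need_zeroed, int *emitted, int max)
{
  int n = 0;
  for (int regno = 0; regno < NUM_HARD_REGS; regno++)
    if (need_zeroed & (1u << regno))
      {
        assert (n < max);
        emitted[n++] = regno;
      }
  return n;
}
```

The middle end would only call part A and hand the result to whatever hook
the target installed; a real target could pick a different instruction
sequence for the same set.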
>
> There also is separate shrink-wrapping, which you haven't touched on at
> all yet. Joy.
Yes, in addition to shrink-wrapping, I also noticed that there are other places
that generate “simple_return” or “return” outside of the epilogue, for example,
in the “dbr” phase (delay_slots phase), in the “mach” phase (machine reorg
phase), etc.
So, generating the zeroing sequence only in the epilogue is not enough.
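
As a toy illustration of the problem (again standalone, not GCC code; the
“toy_insn” enum and “count_unzeroed_returns” are hypothetical names), assume
a flat insn stream in which a later pass has added a second return that the
epilogue pass never saw:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified insn kinds; real RTL is far richer.  */
enum toy_insn { TOY_OTHER, TOY_ZERO_REGS, TOY_RETURN, TOY_SIMPLE_RETURN };

/* Count return insns that are not immediately preceded by a zeroing
   sequence, i.e. exit paths that would leave registers uncleared.  */
size_t
count_unzeroed_returns (const enum toy_insn *insns, size_t n)
{
  assert (insns != NULL || n == 0);

  size_t missing = 0;
  for (size_t i = 0; i < n; i++)
    {
      int is_return = (insns[i] == TOY_RETURN
                       || insns[i] == TOY_SIMPLE_RETURN);
      if (is_return && (i == 0 || insns[i - 1] != TOY_ZERO_REGS))
        missing++;
    }
  return missing;
}
```

For a stream such as { TOY_OTHER, TOY_SIMPLE_RETURN, TOY_OTHER,
TOY_ZERO_REGS, TOY_RETURN }, the epilogue return is covered but the
simple_return is not, so the count is 1.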
Hongjiu and I discussed this more and came up with a new implementation. I
will describe it in another email later.
Thanks.
Qing
>
>
> Segher