Segher and Richard, 

Now there are two major concerns from the discussion so far:

1. (From Richard): The zeroing insns should be inserted after 
pass_thread_prologue_and_epilogue, since later passes (for example, 
pass_regrename) might introduce newly used caller-saved registers. 
     So, we should do this at the beginning of pass_late_compilation (some 
targets wouldn’t cope with doing it later). 

2. (From Segher): The inserted zeroing insns should stay together with the 
return; no other insns should be moved in between the zeroing insns and the 
return insn. Otherwise, a valid gadget could be formed. 

I think that both of these concerns are important and must be addressed for a 
correct implementation. 

In order to support 1, we cannot implement it in “targetm.gen_return()” and 
“targetm.gen_simple_return()”, since both of them are called during 
pass_thread_prologue_and_epilogue; at that time, the register use information 
is still not correct. 

In order to support 2, enhancing EPILOGUE_USES to include the zeroed registers 
is NOT enough to prevent all the zeroing insns from moving around.  More 
restrictions need to be added to these new zeroing insns.  (I think that 
marking these newly zeroed registers as “unspec_volatile” at the RTL level is 
necessary to prevent the insns from being deleted or moved around). 


So, based on the above, I propose the following approach that will resolve the 
above 2 concerns:

1. Add 2 new target hooks:
   A. targetm.pro_epilogue_use (reg)
   This hook should return an UNSPEC_VOLATILE rtx that marks a register as in
   use, to prevent the deletion of register-setting instructions in the
   prologue and epilogue.

   B. targetm.gen_zero_call_used_regs(need_zeroed_hardregs)
   This hook will generate a sequence of zeroing insns that zero the registers 
that are specified in NEED_ZEROED_HARDREGS.

    A default handler for “gen_zero_call_used_regs” could be defined in the 
middle end; it would use mov insns to zero the registers, and then use 
“targetm.pro_epilogue_use(reg)” to mark each zeroed register. 


2. Add a new pass, pass_zero_call_used_regs, at the beginning of 
pass_late_compilation. 

    This pass will search for all “return”s and compute the set of hard 
registers to zero, “need_zeroed_hardregs”, based on data flow information, the 
user request, and the function ABI. 
    It will then call targetm.gen_zero_call_used_regs(need_zeroed_hardregs).

3. The X86 backend will implement its own versions of “gen_zero_call_used_regs” 
and “pro_epilogue_use”.
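
To make step 2 a bit more concrete, here is a minimal stand-alone sketch of how 
“need_zeroed_hardregs” might be computed. It is not GCC code: the bitmask type 
stands in for HARD_REG_SET, and the mode names, the function name, and the 
parameters are all hypothetical, chosen only to illustrate combining the user 
request, the function ABI, and the data flow information.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for GCC's HARD_REG_SET: one bit per hard register.  */
typedef uint32_t hard_reg_set;

/* Hypothetical user request modes; the names are illustrative only.  */
enum zero_regs_mode
{
  ZERO_SKIP, /* Do not zero anything.  */
  ZERO_USED, /* Zero only call-used registers the function actually used.  */
  ZERO_ALL   /* Zero every call-used register.  */
};

/* Compute the set of hard registers to zero before a return.
   CALL_USED comes from the function ABI; LIVE_AT_RETURN and EVER_USED
   are assumed to come from data flow information.  */
hard_reg_set
compute_need_zeroed_hardregs (enum zero_regs_mode mode,
                              hard_reg_set call_used,
                              hard_reg_set live_at_return,
                              hard_reg_set ever_used)
{
  assert (mode == ZERO_SKIP || mode == ZERO_USED || mode == ZERO_ALL);

  if (mode == ZERO_SKIP)
    return 0;

  /* Never zero a register that is still live at the return,
     for example the one holding the return value.  */
  hard_reg_set candidates = call_used & ~live_at_return;

  /* In "used" mode, restrict to registers the function touched.  */
  if (mode == ZERO_USED)
    candidates &= ever_used;

  return candidates;
}
```

With call-used registers 0-3 (mask 0x0F), register 0 live at the return, and 
registers 1 and 2 used in the function, the “all” mode would select registers 
1-3 and the “used” mode would select registers 1 and 2. The resulting set is 
what would be passed to targetm.gen_zero_call_used_regs.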


Let me know if you have any more comments on this approach.

thanks.

Qing




> On Sep 16, 2020, at 5:35 AM, Segher Boessenkool <seg...@kernel.crashing.org> 
> wrote:
> 
> On Tue, Sep 15, 2020 at 08:51:57PM -0500, Qing Zhao wrote:
>>> On Sep 15, 2020, at 6:09 PM, Segher Boessenkool 
>>> <seg...@kernel.crashing.org> wrote:
>>> If you want the zeroing insns to stay with the return, you have to
>>> express that in RTL.  
>> 
>> What do you mean by “express that in RTL”?
>> Could you please explain this in more details?
> 
> Exactly as I say: you need to tell in the RTL that the insns should stay
> together.
> 
> Easiest is to just make it one RTL insn.  There are other ways, but
> those do not help anything here afaics.
> 
>> Do you mean to implement this in “targetm.gen_return” and 
>> “targetm.gen_simple_return”?
> 
> That is the easiest way, yes.
> 
>>> Anything else is extremely fragile.
> 
> 
> Segher
