Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

Richard Sandiford Wed, 16 Sep 2020 23:18:39 -0700

Qing Zhao <qing.z...@oracle.com> writes:
> Segher and Richard, 
>
> Now there are two major concerns from the discussion so far:
>
> 1. (From Richard):  Inserting zero insns should be done after 
> pass_thread_prologue_and_epilogue since later passes (for example, 
> pass_regrename) might introduce new used caller-saved registers. 
>      So, we should do this in the beginning of pass_late_compilation (some 
> targets wouldn’t cope with doing it later). 
>
> 2. (From Segher): The inserted zero insns should stay together with the 
> return, no other insns should move in-between zero insns and return insns. 
> Otherwise, a valid gadget could be formed. 
>
> I think that both of the above 2 concerns are important and should be 
> addressed for the correct implementation. 
>
> In order to support 1,  we cannot implementing it in “targetm.gen_return()” 
> and “targetm.gen_simple_return()”  since “targetm.gen_return()” and 
> “targetm.gen_simple_return()” are called during 
> pass_thread_prologue_and_epilogue, at that time, the use information still 
> not correct. 
>
> In order to support 2, enhancing EPILOGUE_USES to include the zeroed 
> registgers is NOT enough to prevent all the zero insns from moving around.


Right.  The purpose of EPILOGUE_USES was instead to stop the moves from
being deleted as dead.

> More restrictions need to be added to these new zero insns.  (I think that 
> marking these new zeroed registers as “unspec_volatile” at RTL level is 
> necessary to prevent them from deleting from moving around). 
>
>
> So, based on the above, I propose the following approach that will resolve 
> the above 2 concerns:
>
> 1. Add 2 new target hooks:
>    A. targetm.pro_epilogue_use (reg)
>    This hook should return a UNSPEC_VOLATILE rtx to mark a register in use to
>    prevent deleting register setting instructions in prologue and epilogue.
>
>    B. targetm.gen_zero_call_used_regs(need_zeroed_hardregs)
>    This hook will gen a sequence of zeroing insns that zero the registers 
> that specified in NEED_ZEROED_HARDREGS.
>
>     A default handler of “gen_zero_call_used_regs” could be defined in middle 
> end, which use mov insns to zero registers, and then use 
> “targetm.pro_epilogue_use(reg)” to mark each zeroed registers. 

This sounds like you're going back to using:

(insn 18 16 19 2 (set (reg:SI 1 dx)
        (const_int 0 [0])) "t10.c":11:1 -1
     (nil))
(insn 19 18 20 2 (unspec_volatile [
            (reg:SI 1 dx)
        ] UNSPECV_PRO_EPILOGUE_USE) "t10.c":11:1 -1
     (nil))

This also doesn't prevent the zeroing from being moved around.  Like the
EPILOGUE_USES approach, it only prevents the clearing from being removed
as dead.  I still think that the EPILOGUE_USES approach is the better
way of doing that.

In other words: the use insns themselves are volatile and so can't be
moved relative to each other and to other volatile insns.  But the uses
are fake instructions that don't do anything.  The preceding zeroing
instructions are just normal instructions that can be moved around
freely before their respective uses.

I don't think there's a foolproof way of preventing an unknown target
machine_reorg pass from moving the instructions around.  But since we
don't have unknown machine_reorgs (at least not in-tree), I think
instead we should be prepared to patch machine_reorgs where necessary
to ensure that they do the right thing.

If you want to increase the chances that machine_reorgs don't need to be
patched, you could either:

(a) to make the zeroing instructions themselves volatile or
(b) to insert a volatile reference to the register before (rather than
    after) the zeroing instruction

IMO (b) is the way to go, because it avoids the need to define special
volatile move patterns for each type of register.  (b) would be needed
on top of (rather than instead of) the EPILOGUE_USES thing.

I don't think we need a new target-specific unspec_volatile code to do (b).
We can just use an automatically-generated volatile asm to clobber the
registers first.  See e.g. how expand_asm_memory_blockage handles memory
scheduling barriers.

Thanks,
Richard

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

Reply via email to