> On Sep 11, 2020, at 12:32 PM, Richard Sandiford <richard.sandif...@arm.com>
> wrote:
>
> Qing Zhao <qing.z...@oracle.com> writes:
>>> On Sep 11, 2020, at 11:14 AM, Segher Boessenkool
>>> <seg...@kernel.crashing.org> wrote:
>>>
>>> On Fri, Sep 11, 2020 at 11:06:03AM +0100, Richard Sandiford wrote:
>>>> This might have already been discussed/answered, sorry, but:
>>>> when there's a choice, is there an obvious winner between:
>>>>
>>>> (1) clearing call-clobbered registers and then restoring
>>>>     call-preserved ones
>>>> (2) restoring call-preserved registers and then clearing
>>>>     call-clobbered ones
>>>>
>>>> Is one option more likely to be useful to attackers than the other?
>>
>> For the purpose of mitigating ROP, I think that (2) is better than (1),
>> i.e., the sequence that clears the call-clobbered registers will come
>> immediately before the “ret” instruction.
>> This will prevent the gadget from doing anything useful.
>
> OK. The reason I was asking was that (from the naive perspective of
> someone not well versed in this stuff): if the effect of one of the
> register restores is itself a useful gadget, the clearing wouldn't
> protect against it. But if the register restores are not part of the
> intended effect, it seemed that having them immediately before the
> ret might make the gadget harder to use than clearing registers would,
> because the side-effects of restores would be harder to control than the
> (predictable) effect of clearing registers.
>
> But like I say, this is very much not my area of expertise, so that's
> probably missing the point in a major way. ;-)
I am not an expert in the security area either. :-)
My understanding of how this scheme helps against ROP is as follows: the
attacker usually uses scratch registers to pass parameters to the system
call in the gadget. If the scratch registers are cleared immediately before
the “ret”, the parameters passed to the system call are destroyed, and
the attack will therefore likely fail.
So, clearing the scratch registers immediately before the “ret” should be
very helpful for mitigating ROP.
>
> I think the original patch plugged into pass_thread_prologue_and_epilogue,
> is that right?
Yes.
> If we go for (2), then I think it would be better to do
> it at the start of pass_late_compilation instead. (Some targets wouldn't
> cope with doing it later.) The reason for doing it so late is that the
> set of used “volatile”/caller-saved registers is not fixed at prologue
> and epilogue generation: later optimisation passes can introduce uses
> of volatile registers that weren't used previously. (Sorry if this
> has already been suggested.)
Yes, I agree.
I thought that it might be better to move this task to the very end of the
RTL stage, i.e., just before the “final” phase.
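To illustrate why the timing matters, here is a toy model in plain C, not
GCC code. The names regset, reg_bit and regs_to_zero are hypothetical, a
32-bit mask stands in for HARD_REG_SET, and excluding the return-value
registers is my assumption about what the computation has to do. The point
is only that the result depends on the set of used registers, so it must be
computed after the last pass that can add uses.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: one bit per hard register.  This is a hypothetical
   stand-in for GCC's HARD_REG_SET, not the real type.  */
typedef uint32_t regset;

/* Return the mask bit for hard register REGNO.  */
static regset
reg_bit (unsigned regno)
{
  assert (regno < 32);
  return (regset) 1u << regno;
}

/* Registers to clear before "ret": call-clobbered registers that the
   function actually uses, minus those holding the return value
   (the exclusion is an assumption of this sketch).  */
static regset
regs_to_zero (regset call_clobbered, regset used, regset return_value)
{
  return call_clobbered & used & ~return_value;
}
```

If a pass that runs after epilogue generation starts using another
call-clobbered register, the "used" argument grows and so does the result;
a set computed earlier would miss that register.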
Another solution (discussed with Hongjiu) is:

1. Define a new target hook:

     targetm.return_with_zeroing (bool simple_return_p,
                                  HARD_REG_SET need_zeroed_hardregs,
                                  bool gpr_only)

2. Add the following routine in the middle end:

     rtx_insn *
     generate_return_rtx (bool simple_return_p)
     {
       if (targetm.return_with_zeroing)
         {
           /* Pseudocode: compute the set of hard registers to clear
              into need_zeroed_hardregs; gpr_only is assumed to be
              available here.  */
           return targetm.return_with_zeroing (simple_return_p,
                                               need_zeroed_hardregs,
                                               gpr_only);
         }
       else if (simple_return_p)
         return targetm.gen_simple_return ();
       else
         return targetm.gen_return ();
     }

   Then replace all calls to “targetm.gen_simple_return” and
   “targetm.gen_return” with calls to “generate_return_rtx ()”.

3. In the target, implement “return_with_zeroing”.
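
The dispatch in step 2 can be tried out in isolation with the sketch below.
It is a self-contained model in plain C and not GCC code: struct
target_hooks, enum insn_kind, regset and the three example hooks are all
hypothetical stand-ins (for targetm, rtx_insn * and HARD_REG_SET), and the
register set and gpr_only are passed in as arguments here rather than
computed inside the routine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, not GCC types.  */
typedef uint32_t regset;        /* models HARD_REG_SET */
enum insn_kind                  /* models the returned rtx_insn * */
{
  INSN_RETURN,
  INSN_SIMPLE_RETURN,
  INSN_ZERO_THEN_RETURN
};

struct target_hooks
{
  enum insn_kind (*gen_return) (void);
  enum insn_kind (*gen_simple_return) (void);
  /* NULL when the target does not implement the new hook.  */
  enum insn_kind (*return_with_zeroing) (bool simple_return_p,
                                         regset need_zeroed_hardregs,
                                         bool gpr_only);
};

/* Single entry point that every former caller of gen_return and
   gen_simple_return would use instead.  */
static enum insn_kind
generate_return (const struct target_hooks *t, bool simple_return_p,
                 regset need_zeroed_hardregs, bool gpr_only)
{
  assert (t != NULL);
  if (t->return_with_zeroing != NULL)
    return t->return_with_zeroing (simple_return_p, need_zeroed_hardregs,
                                   gpr_only);
  return simple_return_p ? t->gen_simple_return () : t->gen_return ();
}

/* Example hook implementations for a made-up target.  */
static enum insn_kind plain_return (void) { return INSN_RETURN; }
static enum insn_kind plain_simple_return (void) { return INSN_SIMPLE_RETURN; }

static enum insn_kind
zeroing_return (bool simple_return_p, regset need_zeroed_hardregs,
                bool gpr_only)
{
  /* A real target would emit the clearing insns followed by the
     return; this model only reports that it was asked to.  */
  (void) simple_return_p;
  (void) need_zeroed_hardregs;
  (void) gpr_only;
  return INSN_ZERO_THEN_RETURN;
}
```

A target that leaves the new hook as NULL keeps its existing behaviour,
which is the property that makes it safe to replace every call site at once.
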
Let me know your comments on this.
Thanks a lot.
Qing
>
> Unlike Segher, I think this can/should be done in target-independent
> code as far as possible (like the patch seemed to do).
>
> Thanks,
> Richard