> On Sep 3, 2020, at 12:13 PM, Kees Cook <keesc...@chromium.org> wrote:
> 
> On Thu, Sep 03, 2020 at 09:29:54AM -0500, Qing Zhao wrote:
>> On average, all the options starting with “used_…” (i.e., only the registers
>> that are used in the routine are zeroed) have very low runtime
>> overheads: at most 1.72% for integer benchmarks and 1.17% for FP
>> benchmarks.
>> If all the registers are zeroed, the runtime overhead is bigger: on average
>> for integer benchmarks, all_arg is 5.7%, all_gpr is 3.5%, and all is 17.56%.
>> It looks like the overhead of zeroing vector registers is much bigger.
>> 
>> For ROP mitigation, -fzero-call-used-regs=used-gpr-arg should be enough, and
>> its runtime overhead is very small.
> 
> That looks great; thanks for doing those tests!
> 
> (And it seems like these benchmarks are kind of a "worst case" scenario
> with regard to performance, yes? As in it's mostly tight call loops?)

The top 3 benchmarks with the most overhead from this option are
531.deepsjeng_r, 541.leela_r, and 511.povray_r.
All of them are C++ benchmarks.
I guess the most important reason is that their routines are smaller in general
(especially on the hot execution paths and in loops).
As a result, the overhead of the additional zeroing instructions in each
routine is relatively higher.
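
To illustrate the small-routine case, here is a minimal sketch. It assumes the
compiler accepts a zero_call_used_regs function attribute taking the same
"used-gpr-arg" value as the command-line option; the macro and function names
are hypothetical and not taken from the benchmarks. The body of add_one is only
an instruction or two, so any zeroing instructions emitted before its return
are a large fraction of its total work, and the caller pays that cost on every
iteration.

```c
/* Assumption: the attribute is spelled zero_call_used_regs and accepts
 * "used-gpr-arg". On compilers that do not report support for it, the
 * macro expands to nothing and the code still builds. */
#if defined(__has_attribute)
#  if __has_attribute(zero_call_used_regs)
#    define ZERO_USED_GPR_ARG \
        __attribute__((zero_call_used_regs("used-gpr-arg")))
#  endif
#endif
#ifndef ZERO_USED_GPR_ARG
#  define ZERO_USED_GPR_ARG
#endif

/* Hypothetical tiny routine: kept out of line so that the call, and the
 * register zeroing on return, actually happens. */
ZERO_USED_GPR_ARG
__attribute__((noinline))
static int add_one(int x)
{
    return x + 1;
}

/* Hypothetical hot loop: calls the tiny routine once per iteration, so
 * the per-call zeroing cost is paid n times. */
int sum_small_calls(int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += add_one(i);
    return total;
}
```

Comparing the generated assembly for add_one with and without the attribute
(or with and without the option) should show the extra zeroing instructions
next to the single add.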

Qing

> 
> -- 
> Kees Cook
