From: Qing Zhao <qing.z...@oracle.com>
Date: Friday, September 4, 2020 at 9:19 AM
To: "Rodriguez Bahena, Victor" <victor.rodriguez.bah...@intel.com>, Kees Cook 
<keesc...@chromium.org>
Cc: Segher Boessenkool <seg...@kernel.crashing.org>, Jakub Jelinek 
<ja...@redhat.com>, Uros Bizjak <ubiz...@gmail.com>, GCC Patches 
<gcc-patches@gcc.gnu.org>
Subject: Re: PING [Patch][Middle-end]Add 
-fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]




On Sep 3, 2020, at 8:23 PM, Rodriguez Bahena, Victor 
<victor.rodriguez.bah...@intel.com> wrote:



-----Original Message-----
From: Qing Zhao <qing.z...@oracle.com>
Date: Thursday, September 3, 2020 at 12:55 PM
To: Kees Cook <keesc...@chromium.org>
Cc: Segher Boessenkool <seg...@kernel.crashing.org>, Jakub Jelinek 
<ja...@redhat.com>, Uros Bizjak <ubiz...@gmail.com>, "Rodriguez Bahena, 
Victor" <victor.rodriguez.bah...@intel.com>, GCC Patches 
<gcc-patches@gcc.gnu.org>
Subject: Re: PING [Patch][Middle-end]Add 
-fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]




On Sep 3, 2020, at 12:13 PM, Kees Cook <keesc...@chromium.org> wrote:

On Thu, Sep 03, 2020 at 09:29:54AM -0500, Qing Zhao wrote:

On average, all the options starting with “used_…” (i.e., only the registers 
that are used in the routine will be zeroed) have very low runtime overheads: 
at most 1.72% for integer benchmarks and 1.17% for FP benchmarks.
If all the registers are zeroed, the runtime overhead is bigger: on average for 
integer benchmarks, all_arg is 5.7%, all_gpr is 3.5%, and all is 17.56%.
It looks like the overhead of zeroing vector registers is much bigger.

For ROP mitigation, -fzero-call-used-regs=used-gpr-arg should be enough, the 
runtime overhead with this is very small.
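
As a rough illustration of how such an option might be applied to a single 
routine rather than a whole translation unit, here is a minimal sketch. It 
assumes the per-function attribute spelling zero_call_used_regs("used-gpr") 
that later shipped in GCC 11; the macro ZERO_USED_GPR and the function 
sum_bytes are hypothetical names made up for this example, and the attribute 
is skipped on compilers that do not report support for it.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the attribute is spelled zero_call_used_regs, as in GCC 11.
   ZERO_USED_GPR is a hypothetical helper macro, not part of the patch. */
#if defined(__has_attribute)
#  if __has_attribute(zero_call_used_regs)
#    define ZERO_USED_GPR __attribute__((zero_call_used_regs("used-gpr")))
#  endif
#endif
#ifndef ZERO_USED_GPR
#  define ZERO_USED_GPR /* no-op when the compiler lacks the attribute */
#endif

/* Hypothetical routine: on return, the call-used general-purpose registers
   that this function actually used are zeroed by the compiler. */
ZERO_USED_GPR
unsigned sum_bytes(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    unsigned total = 0;
    for (size_t i = 0; i < len; i++)
        total += buf[i];
    return total;
}
```

Passing -fzero-call-used-regs=used-gpr on the command line would instead 
request the same treatment for every function in the file; the observable 
result of sum_bytes is the same either way.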

That looks great; thanks for doing those tests!

(And it seems like these benchmarks are kind of a "worst case" scenario
with regard to performance, yes? As in it's mostly tight call loops?)

   The top 3 benchmarks that have the most overhead from this option are: 
531.deepsjeng_r, 541.leela_r, and 511.povray_r.
   All of them are C++ benchmarks.
   I guess that the most important reason is the smaller routine size in 
general (especially on the hot execution paths or in loops).
   As a result, the overhead of these additional zeroing instructions in each 
routine will be relatively higher.
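
The routine-size argument above can be sketched with a deliberately crude 
model: assume every instruction costs the same, so the relative overhead is 
just the number of added zeroing instructions divided by the number of 
instructions in the routine body. The function name and the instruction 
counts below are hypothetical and are not taken from the measurements.

```c
/* Crude, hypothetical model: every instruction is assumed to cost the same.
   Returns the added zeroing work as a percentage of the routine body. */
double relative_overhead_pct(unsigned body_insns, unsigned zeroing_insns)
{
    if (body_insns == 0)
        return 0.0; /* avoid dividing by zero for an empty body */
    return 100.0 * (double)zeroing_insns / (double)body_insns;
}
```

Under this model, 4 zeroing instructions added to a 100-instruction routine 
cost 4%, while the same 4 instructions added to a 10-instruction routine cost 
40%, which is why benchmarks dominated by small hot routines suffer most.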

   Qing

I think that overhead is expected in benchmarks like 541.leela_r. According to 
https://www.spec.org/cpu2017/Docs/benchmarks/541.leela_r.html, it is a 
benchmark for Artificial Intelligence (Monte Carlo simulation, game tree 
search & pattern recognition). Adding -fzero-call-used-regs incurs an overhead 
each time a function is called, and in areas like game tree search that 
overhead is high.

Qing, thanks a lot for the measurement. I am not sure if this is the limit of 
overhead the community is willing to accept in exchange for extra security (as 
a GCC user, I would be willing to accept it).

From the performance data, we can see that the runtime overhead of clearing 
only the used registers is very reasonable, even for 541.leela_r, 
531.deepsjeng_r, and 511.povray_r. If we try to clear all registers, whether 
or not they are used in the current routine, the overhead increases 
dramatically.

So, my question is:

From the security point of view, does clearing ALL registers have more benefit 
than clearing USED registers?
From my understanding, clearing registers that are not used in the current 
routine does NOT provide additional benefit; correct me if I am wrong here.

You are right, it does not provide additional security.


Thanks.

Qing



Regards

Victor




--
Kees Cook

