Hi,

Looks like both attached .csv files were deleted during the email delivery process; I am not sure what the reason for this is. So I am copying the text here for your reference:

****benchmarks:

C              500.perlbench_r
C              502.gcc_r
C              505.mcf_r
C++            520.omnetpp_r
C++            523.xalancbmk_r
C              525.x264_r
C++            531.deepsjeng_r
C++            541.leela_r
C              557.xz_r
C++/C/Fortran  507.cactuBSSN_r
C++            508.namd_r
C++            510.parest_r
C++/C          511.povray_r
C              519.lbm_r
Fortran/C      521.wrf_r
C++/C          526.blender_r
Fortran/C      527.cam4_r
C              538.imagick_r
C              544.nab_r

***runtime overhead data and code size overhead data:

I converted them to PDF files; hopefully this time they can be attached to the email.
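To make it concrete, here is a small illustration of what the option being measured does. This is a minimal sketch only: example.c and sum3 are made-up names, it assumes a compiler with this patch applied, and the exact zeroing sequence depends on the target and on which registers the routine actually uses.

    /* example.c -- a made-up routine.  With -fzero-call-used-regs=used-gpr-arg,
       the function epilogue zeroes the call-used general-purpose argument
       registers that the routine actually used, right before returning.  */
    int
    sum3 (int a, int b, int c)
    {
      return a + b + c;
    }

    $ gcc -O2 -march=native -fzero-call-used-regs=used-gpr-arg -S example.c

    # On x86-64 the epilogue would then end roughly like this (the
    # return-value register %eax is not zeroed):
    #   xorl  %edi, %edi
    #   xorl  %esi, %esi
    #   xorl  %edx, %edx
    #   ret

The stricter variants (all-gpr, all, etc.) zero progressively larger register sets on every return, which is where the larger overheads in the data below come from.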
thanks.

Qing

> On Sep 3, 2020, at 9:29 AM, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
>
> Per request, I collected runtime performance data and code size data with CPU2017 on an x86 platform.
>
> *** Machine info:
> model name : Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
> $ lscpu | grep NUMA
> NUMA node(s):        2
> NUMA node0 CPU(s):   0-21,44-65
> NUMA node1 CPU(s):   22-43,66-87
>
> *** CPU2017 benchmarks:
> all the benchmarks that contain C/C++: 9 integer benchmarks and 10 FP benchmarks.
>
> *** Configurations:
> intrate and fprate, 22 copies.
>
> *** Compiler options:
> no:           -g -O2 -march=native
> used_gpr_arg: no + -fzero-call-used-regs=used-gpr-arg
> used_arg:     no + -fzero-call-used-regs=used-arg
> all_arg:      no + -fzero-call-used-regs=all-arg
> used_gpr:     no + -fzero-call-used-regs=used-gpr
> all_gpr:      no + -fzero-call-used-regs=all-gpr
> used:         no + -fzero-call-used-regs=used
> all:          no + -fzero-call-used-regs=all
>
> *** Each benchmark runs 3 times.
>
> *** Runtime performance data:
> Please see the attached csv file.
>
> From the data, we can see that:
> On average, all the options starting with "used_" (i.e., only the registers that are actually used in the routine are zeroed) have very low runtime overhead: at most 1.72% for the integer benchmarks and 1.17% for the FP benchmarks.
> If all the registers are zeroed, the runtime overhead is bigger: on average for the integer benchmarks, all_arg is 5.7%, all_gpr is 3.5%, and all is 17.56%.
> It looks like the overhead of zeroing the vector registers is much bigger.
>
> For ROP mitigation, -fzero-call-used-regs=used-gpr-arg should be enough, and its runtime overhead is very small.
>
> *** Code size increase data:
> Please see the attached file.
>
> From the data, we can see that:
> The code size impact is in general very small; the biggest is "all_arg", at 1.06% for the integer benchmarks and 1.13% for the FP benchmarks.
>
> So, from the data collected, I think that the run-time overhead and code size increase from this option are very reasonable.
>
> Let me know your comments and opinions.
>
> thanks.
>
> Qing
>
>> On Aug 25, 2020, at 4:54 PM, Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>
>>> On Aug 24, 2020, at 3:20 PM, Segher Boessenkool <seg...@kernel.crashing.org> wrote:
>>>
>>> Hi!
>>>
>>> On Mon, Aug 24, 2020 at 01:02:03PM -0500, Qing Zhao wrote:
>>>>> On Aug 24, 2020, at 12:49 PM, Segher Boessenkool <seg...@kernel.crashing.org> wrote:
>>>>> On Wed, Aug 19, 2020 at 06:27:45PM -0500, Qing Zhao wrote:
>>>>>>> On Aug 19, 2020, at 5:57 PM, Segher Boessenkool <seg...@kernel.crashing.org> wrote:
>>>>>>> Numbers on how expensive this is (for what arch, in code size and in execution time) would be useful. If it is so expensive that no one will use it, it helps security at most none at all :-(
>>>>>
>>>>> Without numbers on this, no one can determine if it is a good tradeoff for them. And we (the GCC people) cannot know if it will be useful for enough users that it will be worth the effort for us. Which is why I keep hammering on this point.
>>>> I can collect some run-time overhead data on this. Do you have a recommendation on what test suite I could use for this testing? (Is CPU2017 good enough?)
>>>
>>> I would use something more real-life, not 12 small pieces of code.
>>
>> There is some basic information about the CPU2017 benchmarks at the link below:
>>
>> https://www.spec.org/cpu2017/Docs/overview.html#suites
>>
>> GCC itself is one of the benchmarks in CPU2017 (502.gcc_r), and 526.blender_r is even larger than 502.gcc_r. There are several other quite big benchmarks as well (perlbench, xalancbmk, parest, imagick, etc.).
>>
>> thanks.
>>
>> Qing