Hi,
It looks like both attached .csv files were stripped during email delivery;
I am not sure why. So I am copying the text here for your reference:
***benchmarks:
C 500.perlbench_r
C 502.gcc_r
C 505.mcf_r
C++ 520.omnetpp_r
C++ 523.xalancbmk_r
C 525.x264_r
C++ 531.deepsjeng_r
C++ 541.leela_r
C 557.xz_r
C++/C/Fortran 507.cactuBSSN_r
C++ 508.namd_r
C++ 510.parest_r
C++/C 511.povray_r
C 519.lbm_r
Fortran/C 521.wrf_r
C++/C 526.blender_r
Fortran/C 527.cam4_r
C 538.imagick_r
C 544.nab_r
***runtime overhead data and code size overhead data: I converted them to PDF
files; hopefully this time they can be attached to the email.
thanks.
Qing
> On Sep 3, 2020, at 9:29 AM, Qing Zhao via Gcc-patches
> <[email protected]> wrote:
>
> Hi,
>
> Per request, I collected runtime performance data and code size data with
> CPU2017 on a X86 platform.
>
> *** Machine info:
> model name: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
> $ lscpu | grep NUMA
> NUMA node(s): 2
> NUMA node0 CPU(s): 0-21,44-65
> NUMA node1 CPU(s): 22-43,66-87
>
> ***CPU2017 benchmarks:
> all the benchmarks that contain C/C++ code: 9 integer benchmarks and 10 FP
> benchmarks.
>
> ***Configurations:
> Intrate and fprate, 22 copies.
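>
> For reference (a sketch, not the exact command used; the config file name
> "myconf" is hypothetical), a rate run matching the setup above could be
> launched as:
>
>     runcpu --config=myconf --copies=22 --iterations=3 intrate fprate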
>
> ***Compiler options:
> no : -g -O2 -march=native
> used_gpr_arg: no + -fzero-call-used-regs=used-gpr-arg
> used_arg: no + -fzero-call-used-regs=used-arg
> all_arg: no + -fzero-call-used-regs=all-arg
> used_gpr: no + -fzero-call-used-regs=used-gpr
> all_gpr: no + -fzero-call-used-regs=all-gpr
> used: no + -fzero-call-used-regs=used
> all: no + -fzero-call-used-regs=all
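>
> As a minimal illustration (my sketch, not one of the measured
> configurations), the option makes each function epilogue zero the selected
> call-used registers before returning:
>
>     /* zero.c -- illustrative only; the register names assume the
>        x86-64 System V calling convention.  */
>     int
>     sum (int a, int b)
>     {
>       return a + b;   /* a and b arrive in %edi and %esi.  */
>     }
>
>     /* Compile with:
>          gcc -O2 -march=native -fzero-call-used-regs=used-gpr-arg -c zero.c
>        Before the final ret, the epilogue zeroes the argument GPRs the
>        function actually used (here %edi and %esi), so stale values cannot
>        leak to the caller or feed ROP gadgets; the return value in %eax is
>        of course preserved.  */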
>
> ***Each benchmark runs 3 times.
>
> ***runtime performance data:
> Please see the attached csv file
>
>
> From the data, we can see that:
> On average, all the options starting with “used_…” (i.e., only the registers
> that are used in the routine are zeroed) have very low runtime overhead: at
> most 1.72% for the integer benchmarks and 1.17% for the FP benchmarks.
> If all the registers are zeroed, the runtime overhead is bigger: on average
> for the integer benchmarks, all_arg is 5.7%, all_gpr is 3.5%, and all is
> 17.56%.
> It looks like the overhead of zeroing vector registers is much bigger.
>
> For ROP mitigation, -fzero-call-used-regs=used-gpr-arg should be enough, and
> its runtime overhead is very small.
>
> ***code size increase data:
>
> Please see the attached file
>
>
> From the data, we can see that:
> The code size impact is in general very small; the biggest is “all_arg”, at
> 1.06% for the integer benchmarks and 1.13% for the FP benchmarks.
>
> So, from the data collected, I think that the run-time overhead and code size
> increase from this option are very reasonable.
>
> Let me know your comments and opinions.
>
> thanks.
>
> Qing
>
>> On Aug 25, 2020, at 4:54 PM, Qing Zhao via Gcc-patches
>> <[email protected]> wrote:
>>
>>
>>
>>> On Aug 24, 2020, at 3:20 PM, Segher Boessenkool
>>> <[email protected]> wrote:
>>>
>>> Hi!
>>>
>>> On Mon, Aug 24, 2020 at 01:02:03PM -0500, Qing Zhao wrote:
>>>>> On Aug 24, 2020, at 12:49 PM, Segher Boessenkool
>>>>> <[email protected]> wrote:
>>>>> On Wed, Aug 19, 2020 at 06:27:45PM -0500, Qing Zhao wrote:
>>>>>>> On Aug 19, 2020, at 5:57 PM, Segher Boessenkool
>>>>>>> <[email protected]> wrote:
>>>>>>> Numbers on how expensive this is (for what arch, in code size and in
>>>>>>> execution time) would be useful. If it is so expensive that no one will
>>>>>>> use it, it helps security at most none at all :-(
>>>>>
>>>>> Without numbers on this, no one can determine if it is a good tradeoff
>>>>> for them. And we (the GCC people) cannot know if it will be useful for
>>>>> enough users that it will be worth the effort for us. Which is why I
>>>>> keep hammering on this point.
>>>> I can collect some run-time overhead data on this. Do you have a
>>>> recommendation on what test suite I can use for this testing? (Is
>>>> CPU2017 good enough?)
>>>
>>> I would use something more real-life, not 12 small pieces of code.
>>
>> There is some basic information about the CPU2017 benchmarks at the link
>> below:
>>
>> https://www.spec.org/cpu2017/Docs/overview.html#suites
>>
>> GCC itself is one of the benchmarks in CPU2017 (502.gcc_r), and
>> 526.blender_r is even larger than 502.gcc_r. There are several other quite
>> big benchmarks as well (perlbench, xalancbmk, parest, imagick, etc.).
>>
>> thanks.
>>
>> Qing