On Sun, Jan 10, 2021 at 10:32:47PM +0800, kernel test robot wrote:
> 
> Greeting,
> 
> FYI, we noticed a -18.4% regression of reaim.jobs_per_min due to commit:
> 
> 
> commit: 2b0d3d3e4fcfb19d10f9a82910b8f0f05c56ee3e ("percpu_ref: reduce memory footprint of percpu_ref in fast path")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> 
> in testcase: reaim
> on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
> with following parameters:
> 
>       runtime: 300s
>       nr_task: 100%
>       test: short
>       cpufreq_governor: performance
>       ucode: 0x5002f01
> 
> test-description: REAIM is an updated and improved version of AIM 7 benchmark.
> test-url: https://sourceforge.net/projects/re-aim-7/
> 
> In addition to that, the commit also has significant impact on the following tests:
> 
> +------------------+---------------------------------------------------------------------------+
> | testcase: change | vm-scalability: vm-scalability.throughput -2.8% regression                |
> | test machine     | 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
> | test parameters  | cpufreq_governor=performance                                              |
> |                  | runtime=300s                                                              |
> |                  | test=lru-file-mmap-read-rand                                              |
> |                  | ucode=0x5003003                                                           |
> +------------------+---------------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_process_ops 14.5% improvement            |
> | test machine     | 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory    |
> | test parameters  | cpufreq_governor=performance                                              |
> |                  | mode=process                                                              |
> |                  | nr_task=50%                                                               |
> |                  | test=page_fault2                                                          |
> |                  | ucode=0x16                                                                |
> +------------------+---------------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_process_ops -13.0% regression            |
> | test machine     | 104 threads Skylake with 192G memory                                      |
> | test parameters  | cpufreq_governor=performance                                              |
> |                  | mode=process                                                              |
> |                  | nr_task=50%                                                               |
> |                  | test=malloc1                                                              |
> |                  | ucode=0x2006906                                                           |
> +------------------+---------------------------------------------------------------------------+
> | testcase: change | vm-scalability: vm-scalability.throughput -2.3% regression                |
> | test machine     | 96 threads Intel(R) Xeon(R) CPU @ 2.30GHz with 128G memory                |
> | test parameters  | cpufreq_governor=performance                                              |
> |                  | runtime=300s                                                              |
> |                  | test=lru-file-mmap-read-rand                                              |
> |                  | ucode=0x5002f01                                                           |
> +------------------+---------------------------------------------------------------------------+
> | testcase: change | fio-basic: fio.read_iops -4.8% regression                                 |
> | test machine     | 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
> | test parameters  | bs=4k                                                                     |
> |                  | cpufreq_governor=performance                                              |
> |                  | disk=2pmem                                                                |
> |                  | fs=xfs                                                                    |
> |                  | ioengine=libaio                                                           |
> |                  | nr_task=50%                                                               |
> |                  | runtime=200s                                                              |
> |                  | rw=randread                                                               |
> |                  | test_size=200G                                                            |
> |                  | time_based=tb                                                             |
> |                  | ucode=0x5002f01                                                           |
> +------------------+---------------------------------------------------------------------------+
> | testcase: change | stress-ng: stress-ng.stackmmap.ops_per_sec -45.4% regression              |
> | test machine     | 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory      |
> | test parameters  | class=memory                                                              |
> |                  | cpufreq_governor=performance                                              |
> |                  | disk=1HDD                                                                 |
> |                  | nr_threads=100%                                                           |
> |                  | testtime=10s                                                              |
> |                  | ucode=0x5002f01                                                           |
> +------------------+---------------------------------------------------------------------------+

I just ran a quick test of the last two (fio and stress-ng) on 2b0d3d3e4fcf
("percpu_ref: reduce memory footprint of percpu_ref in fast path") and
cf785af19319 ("block: warn if !__GFP_DIRECT_RECLAIM in bio_crypt_set_ctx()").

I don't see any difference between the two kernels (fio on null_blk with 224
hw queues, and 'stress-ng --stackmmap-ops') on a 224-core, dual-socket system.
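For anyone repeating the comparison, here is a rough sketch of how the
result from two kernels could be checked against run-to-run noise. It is
only an illustration: the function names, the "larger than the noise"
rule, and the sample numbers are all assumptions, not the method used by
the test robot or in the quick test above.

```python
from statistics import mean, stdev


def percent_change(base_runs, patched_runs):
    """Change of the patched mean relative to the base mean, in percent."""
    base = mean(base_runs)
    return (mean(patched_runs) - base) / base * 100.0


def relative_stddev(runs):
    """Run-to-run noise, expressed as a percentage of the mean."""
    return stdev(runs) / mean(runs) * 100.0


def is_significant(base_runs, patched_runs):
    """Crude rule (an assumption): the change must exceed the noise of both kernels."""
    noise = max(relative_stddev(base_runs), relative_stddev(patched_runs))
    return abs(percent_change(base_runs, patched_runs)) > noise


# Hypothetical throughput samples, NOT measured data.
base = [1000.0, 1010.0, 990.0]      # e.g. runs on cf785af19319
patched = [1005.0, 995.0, 1000.0]   # e.g. runs on 2b0d3d3e4fcf

print("change: %.1f%%" % percent_change(base, patched))
print("significant:", is_significant(base, patched))
```

With several runs per kernel, a change that stays inside the noise of either
kernel should not be reported as a regression.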

BTW, this patch itself doesn't touch fast path code, so it is not supposed
to affect performance.

Can you double check if the test itself is good?

Note: cf785af19319 is 2b0d3d3e4fcf^



Thanks,
Ming
